INTERNATIONAL PRELIMINARY EXAMINATION AUTHORITY



In the Application of: 

E.I. DU PONT DE NEMOURS AND COMPANY
Case No: BC1003PCT 

International Application No.: PCT/US00/09723 
International Filing Date: 12/04/2000 

For: HOMOLOGS OF MAR-BINDING FILAMENT-LIKE PROTEIN 1 (MFP1) 



Wilmington Delaware, USA 
September 26, 2000 

AMENDMENT UNDER ARTICLE 34 

The European Patent Office 

Erhardtstrasse 27 

D-80298 Munich 2, Germany 

Sir: 

Please cancel Claims 1-20 and substitute the following fifty-six (56) claims: 

1. An isolated nucleic acid fragment encoding a tobacco MFP1 protein selected
from the group consisting of:

(a) an isolated nucleic acid fragment encoding all or a substantial portion of
the amino acid sequence selected from the group consisting of SEQ ID
NO:2, and SEQ ID NO:4;

(b) an isolated nucleic acid fragment that is substantially similar to an
isolated nucleic acid fragment encoding all or a substantial portion of the
amino acid sequence selected from the group consisting of SEQ ID
NO:2, and SEQ ID NO:4;

(c) an isolated nucleic acid molecule that hybridizes with a nucleic acid
sequence of (a) or (b) under the following hybridization conditions: 5 x
Denhards, 5 x SSPE, 5% SDS, 20 µg/mL salmon sperm DNA at 55 °C;

(d) an isolated nucleic acid molecule that hybridizes with a nucleic acid
sequence selected from the group consisting of SEQ ID NO:1, SEQ ID
NO:3, SEQ ID NO:14, and SEQ ID NO:15, under the following
hybridization conditions: 5 x Denhards, 5 x SSPE, 5% SDS, 20 µg/mL
salmon sperm DNA at 55 °C; and

(e) an isolated nucleic acid fragment that is complementary to (a), (b), (c) or
(d).

2. The isolated nucleic acid fragment of Claim 1 selected from the group
consisting of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:14, and SEQ ID NO:15.

3. A polypeptide encoded by the isolated nucleic acid fragment of Claim 1.

4. The polypeptide of Claim 3 selected from the group consisting of SEQ ID
NO:2, and SEQ ID NO:4.

5. An isolated nucleic acid fragment encoding a tobacco MFP1 polypeptide, the 
peptide having at least 77% identity to SEQ ID NO: 17. 

6. An MFP1 polypeptide encoded by the nucleic acid fragment of Claim 5. 

7. A chimeric gene comprising the isolated nucleic acid fragment of either of 
Claims 1 or 5 operably linked to suitable regulatory sequences. 

8. A transformed host cell comprising a host cell and the chimeric gene of
Claim 7.

9. The transformed host cell of Claim 8 wherein the host cell is a plant cell.

10. The transformed host cell of Claim 8 wherein the host cell is E. coli.

11. A method of altering the level of expression of a plant MFP1 protein in a
host cell comprising:

(a) transforming a host cell with the chimeric gene of Claim 7 and;

(b) growing the transformed host cell produced in step (a) under conditions
that are suitable for expression of the chimeric gene

resulting in production of altered levels of a plant MFP1 protein in the transformed host
cell relative to expression levels of an untransformed host cell.

12. A method of obtaining a nucleic acid fragment encoding all or a substantial 
portion of the amino acid sequence encoding a tobacco MFP1 protein comprising: 

(a) probing a cDNA or genomic library with the nucleic acid fragment of 
Claim 1;

(b) identifying a DNA clone that hybridizes with the nucleic acid fragment
of Claim 1; and

(c) sequencing the cDNA or genomic fragment that comprises the clone
identified in step (b),

wherein the sequenced cDNA or genomic fragment encodes a tobacco MFP1 protein.

13. A method of obtaining a nucleic acid fragment encoding all or a substantial
portion of the amino acid sequence encoding a tobacco MFP1 protein comprising:

(a) synthesizing at least one oligonucleotide primer corresponding to a
portion of the sequence selected from the group consisting of SEQ ID
NO:1, SEQ ID NO:3, SEQ ID NO:14, and SEQ ID NO:15;

(b) amplifying a cDNA insert present in a cloning vector using the
oligonucleotide primer of step (a);

wherein the amplified cDNA insert encodes a tobacco MFP1 protein.

14. The product of the method of Claims 12 or 13.

15. An isolated nucleic acid fragment encoding a soybean MFP1 protein
selected from the group consisting of:

(a) an isolated nucleic acid fragment encoding all or a substantial portion of
the amino acid sequence as set forth in SEQ ID NO:20;

(b) an isolated nucleic acid fragment that is substantially similar to an
isolated nucleic acid fragment encoding all or a substantial portion of the
amino acid sequence as set forth in SEQ ID NO:20;

(c) an isolated nucleic acid molecule that hybridizes with a nucleic acid
sequence of (a) or (b) under the following hybridization conditions: 5 x
Denhards, 5 x SSPE, 5% SDS, 20 µg/mL salmon sperm DNA at 55 °C;

(d) an isolated nucleic acid molecule that hybridizes with a nucleic acid
sequence as set forth in SEQ ID NO:19 under the following
hybridization conditions: 5 x Denhards, 5 x SSPE, 5% SDS, 20 µg/mL
salmon sperm DNA at 55 °C; and

(e) an isolated nucleic acid fragment that is complementary to (a), (b), (c) or
(d).

16. The isolated nucleic acid fragment of Claim 15 as set forth in SEQ ID
NO:19.

17. A polypeptide encoded by the isolated nucleic acid fragment of Claim 15. 

18. The polypeptide of Claim 17 as set forth in SEQ ID NO:20.

19. A nucleic acid fragment, isolated from soybean, encoding an MFP1
polypeptide, the polypeptide having at least 46% identity to SEQ ID NO:17 over a length
of 388 amino acids as compared by the Jotun-Hein algorithm.

20. An MFP1 polypeptide encoded by the nucleic acid fragment of Claim 19.

21. A chimeric gene comprising the isolated nucleic acid fragment of either of
Claims 15 or 19 operably linked to suitable regulatory sequences.

22. A transformed host cell comprising a host cell and the chimeric gene of
Claim 21.

23. The transformed host cell of Claim 22 wherein the host cell is a plant cell.

24. The transformed host cell of Claim 22 wherein the host cell is E. coli.

25. A method of altering the level of expression of a plant MFP1 protein in a
host cell comprising:

(a) transforming a host cell with the chimeric gene of Claim 21 and;

(b) growing the transformed host cell produced in step (a) under conditions
that are suitable for expression of the chimeric gene

resulting in production of altered levels of a plant MFP1 protein in the transformed host
cell relative to expression levels of an untransformed host cell.

26. A method of obtaining a nucleic acid fragment encoding all or a substantial
portion of the amino acid sequence encoding a soybean MFP1 protein comprising:

(a) probing a cDNA or genomic library with the nucleic acid fragment of
Claim 15;

(b) identifying a DNA clone that hybridizes with the nucleic acid fragment
of Claim 15; and

(c) sequencing the cDNA or genomic fragment that comprises the clone
identified in step (b),

wherein the sequenced cDNA or genomic fragment encodes a soybean MFP1 protein.

27. A method of obtaining a nucleic acid fragment encoding all or a substantial
portion of the amino acid sequence encoding a soybean MFP1 protein comprising:

(a) synthesizing at least one oligonucleotide primer corresponding to a
portion of the sequence as set forth in SEQ ID NO:19;

(b) amplifying a cDNA insert present in a cloning vector using the
oligonucleotide primer of step (a);

wherein the amplified cDNA insert encodes a corn MFP1 protein.

28. The product of the method of Claims 26 or 27.

29. An isolated nucleic acid fragment encoding a corn MFP1 protein selected
from the group consisting of:






(a) an isolated nucleic acid fragment encoding all or a substantial portion of
the amino acid sequence as set forth in SEQ ID NO:22;

(b) an isolated nucleic acid fragment that is substantially similar to an
isolated nucleic acid fragment encoding all or a substantial portion of the
amino acid sequence as set forth in SEQ ID NO:22;

(c) an isolated nucleic acid molecule that hybridizes with a nucleic acid
sequence of (a) or (b) under the following hybridization conditions: 5 x
Denhards, 5 x SSPE, 5% SDS, 20 µg/mL salmon sperm DNA at 55 °C;

(d) an isolated nucleic acid molecule that hybridizes with a nucleic acid
sequence as set forth in SEQ ID NO:21 under the following
hybridization conditions: 5 x Denhards, 5 x SSPE, 5% SDS, 20 µg/mL
salmon sperm DNA at 55 °C; and

(e) an isolated nucleic acid fragment that is complementary to (a), (b), (c) or
(d).

30. The isolated nucleic acid fragment of Claim 29 as set forth in SEQ ID
NO:21.

31. A polypeptide encoded by the isolated nucleic acid fragment of Claim 29.

32. The polypeptide of Claim 31 as set forth in SEQ ID NO:22.

33. A nucleic acid fragment, isolated from corn, encoding an MFP1 polypeptide,
the polypeptide having at least 40% identity to SEQ ID NO:17, over a length of about
672 amino acids as compared by the Jotun-Hein algorithm.

34. An MFP1 polypeptide encoded by the nucleic acid fragment of Claim 33.

35. A chimeric gene comprising the isolated nucleic acid fragment of either of
Claims 29 or 33 operably linked to suitable regulatory sequences.

36. A transformed host cell comprising a host cell and the chimeric gene of
Claim 35.

37. The transformed host cell of Claim 36 wherein the host cell is a plant cell.

38. The transformed host cell of Claim 36 wherein the host cell is E. coli.

39. A method of altering the level of expression of a plant MFP1 protein in a 

host cell comprising: 

(a) transforming a host cell with the chimeric gene of Claim 35 and; 

(b) growing the transformed host cell produced in step (a) under conditions 
that are suitable for expression of the chimeric gene 

resulting in production of altered levels of a plant MFP1 protein in the transformed host 
cell relative to expression levels of an untransformed host cell.

40. A method of obtaining a nucleic acid fragment encoding all or a substantial
portion of the amino acid sequence encoding a corn MFP1 protein comprising:

(a) probing a cDNA or genomic library with the nucleic acid fragment of
Claim 29;

(b) identifying a DNA clone that hybridizes with the nucleic acid fragment
of Claim 29; and

(c) sequencing the cDNA or genomic fragment that comprises the clone
identified in step (b),

wherein the sequenced cDNA or genomic fragment encodes a corn MFP1 protein.

41. A method of obtaining a nucleic acid fragment encoding all or a substantial
portion of the amino acid sequence encoding a corn MFP1 protein comprising:

(a) synthesizing at least one oligonucleotide primer corresponding to a
portion of the sequence as set forth in SEQ ID NO:21;

(b) amplifying a cDNA insert present in a cloning vector using the
oligonucleotide primer of step (a);

wherein the amplified cDNA insert encodes a corn MFP1 protein.

42. The product of the method of Claims 40 or 41.

43. An isolated nucleic acid fragment encoding a rice MFP1 protein selected
from the group consisting of:

(a) an isolated nucleic acid fragment encoding all or a substantial portion of
the amino acid sequence as set forth in SEQ ID NO:24;

(b) an isolated nucleic acid fragment that is substantially similar to an
isolated nucleic acid fragment encoding all or a substantial portion of the
amino acid sequence as set forth in SEQ ID NO:24;

(c) an isolated nucleic acid molecule that hybridizes with a nucleic acid
sequence of (a) or (b) under the following hybridization conditions: 5 x
Denhards, 5 x SSPE, 5% SDS, 20 µg/mL salmon sperm DNA at 55 °C;

(d) an isolated nucleic acid molecule that hybridizes with a nucleic acid
sequence as set forth in SEQ ID NO:23 under the following
hybridization conditions: 5 x Denhards, 5 x SSPE, 5% SDS, 20 µg/mL
salmon sperm DNA at 55 °C; and

(e) an isolated nucleic acid fragment that is complementary to (a), (b), (c) or
(d).

44. The isolated nucleic acid fragment of Claim 43 as set forth in SEQ ID
NO:23.



45. A polypeptide encoded by the isolated nucleic acid fragment of Claim 43.

46. The polypeptide of Claim 45 as set forth in SEQ ID NO:24.

47. A nucleic acid fragment, isolated from rice, encoding an MFP1
polypeptide, the polypeptide having at least 39% identity to SEQ ID NO:17 over a length
of 107 amino acids as compared by the Jotun-Hein algorithm.

48. An MFP1 polypeptide encoded by the nucleic acid fragment of Claim 47.

49. A chimeric gene comprising the isolated nucleic acid fragment of either of
Claims 43 or 47 operably linked to suitable regulatory sequences.

50. A transformed host cell comprising a host cell and the chimeric gene of
Claim 49.

51. The transformed host cell of Claim 50 wherein the host cell is a plant cell.

52. The transformed host cell of Claim 50 wherein the host cell is E. coli.

53. A method of altering the level of expression of a plant MFP1 protein in a
host cell comprising:

(a) transforming a host cell with the chimeric gene of Claim 43 and;

(b) growing the transformed host cell produced in step (a) under conditions
that are suitable for expression of the chimeric gene

resulting in production of altered levels of a plant MFP1 protein in the transformed host
cell relative to expression levels of an untransformed host cell.

54. A method of obtaining a nucleic acid fragment encoding all or a substantial
portion of the amino acid sequence encoding a rice MFP1 protein comprising:

(a) probing a cDNA or genomic library with the nucleic acid fragment of
Claim 43;

(b) identifying a DNA clone that hybridizes with the nucleic acid fragment
of Claim 43; and

(c) sequencing the cDNA or genomic fragment that comprises the clone
identified in step (b),

wherein the sequenced cDNA or genomic fragment encodes a rice MFP1 protein.

55. A method of obtaining a nucleic acid fragment encoding all or a substantial
portion of the amino acid sequence encoding a rice MFP1 protein comprising:

(a) synthesizing at least one oligonucleotide primer corresponding to a
portion of the sequence as set forth in SEQ ID NO:23;

(b) amplifying a cDNA insert present in a cloning vector using the
oligonucleotide primer of step (a);

wherein the amplified cDNA insert encodes a rice MFP1 protein.

56. The product of the method of Claims 54 or 55.

DISCUSSION

By virtue of these amendments Applicant's invention is drawn to nucleic
acids and polypeptides corresponding to Tobacco MFP1 (Claims 1-14), Soybean MFP1
(Claims 15-28), Corn MFP1 (Claims 29-42) and Rice MFP1 (Claims 43-56), and
genetic chimera and transformants containing the same, and their methods of use.



Very truly yours, 

S. NEIL FELTHAM 

Attorney for Applicant 
Registration No.: 36,506 
Telephone: (302)992-6460 
Fax No.: (302)892-7949

HOMOLOGS OF MAR-BINDING FILAMENT-LIKE PROTEIN 1 (MFP1)



This application claims the benefit of U.S. Provisional Application
No. 60/128,900, filed April 12, 1999.

FIELD OF THE INVENTION 
This invention is in the field of plant molecular biology. This invention 
pertains to nucleic acid fragments encoding proteins that are homologs of the
MAR-binding filament-like protein 1 (MFP1) from tomato. More specifically,
this invention pertains to two tobacco MFP1 genes and MFP1 homologs from
corn, soybean and rice.

BACKGROUND OF THE INVENTION 
The nuclear matrix hypothesis proposes a structural framework for the
eukaryotic nucleus that is similar to the cytoskeleton. To date, its best
characterized component is the lamina, a filamentous protein network that lines
the inner membrane of the nuclear envelope. Major components of the lamina
include a group of intermediate-filament (IF) proteins, collectively known as
nuclear lamins, that are classified as type A, B, and C (McKeon et al., Nature
319:463-468 (1986)). Lamin B is attached to the inner nuclear membrane via a
C-terminal C15 farnesyl group (Schafer et al., Annu. Rev. Genet. 30:209-237
(1992)), whereas lamins A and C bind to lamin B. Other integral membrane
proteins interact with lamin B and most likely stabilize the membrane attachment
of lamin B (Furukawa et al., EMBO J. 14:1626-1636 (1995)). Recent studies have
also demonstrated the ability of lamins A and B to bind DNA, suggesting a role
for mammalian lamins in anchoring chromatin to the nuclear envelope. The
interaction between nuclear envelope, lamina, and chromatin is considered to be
of fundamental importance for higher order chromosome organization, as well as
the assembly and disassembly of the nuclear envelope during mitosis (Furukawa
et al., EMBO J. 14:1626-1636 (1995)).

The nuclear matrix is a second structural skeleton that has been
biochemically defined as the insoluble component that remains after treatment of
isolated nuclei with DNase I and extraction of proteins with high-salt solutions
(Berezney et al., Biochem. Biophys. Res. Comm. 60:1410-1417 (1974)) or the
chaotropic agent lithium diiodosalicylate (Mirkowitch et al., Cell 39:223-232
(1984)). Chromatin binds to the nuclear matrix via matrix attachment regions
(MARs) in the DNA. MARs are generally AT-rich DNA sequences that are
several hundred base pairs long and localized to noncoding regions of the DNA,
but often flanking genes (Gasser et al., Trends Genet. 3:16-22 (1987)). However,
there is no consensus sequence known for MARs. The significance of structural
characteristics for MARs such as DNA bending and a narrow minor groove due to
oligo(dA) tracts has been previously proposed. MARs have been shown to
increase transcriptional activity of a linked gene and to confer
position-independent, copy-number dependent expression in stably transfected cells
(Phi-Wan et al., EMBO J. 7:655-664 (1988)).

A small number of MAR binding proteins have been identified from
animal nuclei, and they are considered to be components of the nuclear matrix
(von Kries et al., Cell 64:123-135 (1991); Dickinson et al., Cell 70:631-645
(1992); Romig et al., EMBO J. 11:3431-3440 (1992); Tsutsui et al., J. Biol. Chem.
268:12886-12894 (1993); Renz et al., Nucleic Acids Res. 24:843-849 (1996); U.S.
5,652,340). In addition, it has been shown that lamins specifically bind to MARs
(Luderus et al., Mol. Cell. Biol. 14:6297-6305 (1994)). The specific interaction
between DNA and the nuclear matrix/nuclear lamina is most likely an important
mechanism for long-range gene regulation and higher order chromatin
organization (Gasser et al., Trends Genet. 3:16-22 (1987)).

Most investigations into structural components of the nucleus have
focused on proteins in vertebrates and Drosophila, but even in these organisms,
our knowledge about the molecular constituents of the nuclear matrix is sparse.
Significantly less information is available for other eukaryotes, and in particular
for plants. Proteins that are immunologically related to animal IF proteins and
lamins have been detected in pea and carrot nuclei (Beven et al., J. Cell Sci.
(1991) 98 (3), 293-30; McNulty et al., J. Cell Sci. 103:407-414 (1992)). Plant
nuclear matrix preparations that bind to animal MARs have been reported,
suggesting that proteins with similar DNA binding specificities exist in plants as
well (Hall et al., Proc. Natl. Acad. Sci. USA 88:9320-9324 (1991)).

Effects of MARs on gene expression in plants have been reported, but
have been quite variable. In some experimental systems, no reduction of
variability but an increase in expression level has been reported (Breyne et al.,
Plant Cell 4:463-471 (1992); Allen et al., Plant Cell 5:603-613 (1993); Allen
et al., Plant Cell 8:899-913 (1996); U.S. 5,773,689). Other authors have found no
significant increase in expression level, but a reduction of variability
(van der Geest et al., Plant J. 6:413-423 (1994); Mlynarova et al., Plant Cell
6:417-426 (1994)). It is not clear what causes these observed differences, but they
will most probably be due to the fact that MARs establish different molecular
interactions, which might either depend on the features of the MAR itself or on
the specific molecular environment of the transformed cell/tissue. The routine use
of MARs for strategies to improve transgene expression will greatly depend on the
characterization of the proteins involved in DNA-nuclear matrix attachment and
the factors responsible for the observed increase in gene expression.

Currently, no sequence information is available for plant lamin-like
proteins. However, the cloning of the cDNA for a plant MAR-binding protein,
MFP1, from tomato has been reported (Meier et al., Plant Cell 8:2105-2115
(1996)). MFP1 has structural features of a filament-like protein and it
preferentially binds to MAR DNA sequences from both plants and animals. In
contrast to other known MAR binding proteins, MFP1 contains a hydrophobic
N-terminal amino acid sequence that might function as a membrane-spanning
domain. MFP1, therefore, has features of a novel anchor protein that most likely
connects chromatin via MAR DNA with the nuclear envelope and nuclear
filament proteins.

In order to routinely use the attachment of transgenes to the nuclear matrix
to improve gene expression, it will be necessary to further characterize the elements
involved in this process and to better understand the underlying mechanisms.
Thus, a need exists to identify and characterize additional nuclear matrix proteins.
The present invention presents MFP1-like proteins from other plant species.
Furthermore, the present invention shows that a single, immunologically related
protein of comparable size is present in a variety of higher-plant species, including
important crop plants. This invention pertains to the isolation of cDNAs
corresponding to two tobacco MFP1 genes and the characterization of the MFP1
gene family in tobacco. The invention also pertains to the identification and
partial characterization of EST sequences from corn, soybean and rice encoding
MFP1 proteins from these crop species.

SUMMARY OF THE INVENTION

The present invention provides an isolated nucleic acid fragment encoding
a plant MFP1 protein selected from the group consisting of: (a) an isolated
nucleic acid fragment encoding all or a substantial portion of the amino acid
sequence selected from the group consisting of SEQ ID NO:2, SEQ ID NO:4,
SEQ ID NO:20, SEQ ID NO:22 and SEQ ID NO:24; (b) an isolated nucleic acid
fragment that is substantially similar to an isolated nucleic acid fragment
encoding all or a substantial portion of the amino acid sequence selected from the
group consisting of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:20, SEQ ID
NO:22 and SEQ ID NO:24; (c) an isolated nucleic acid molecule that hybridizes
with a nucleic acid sequence of (a) or (b) under the following hybridization
conditions: 5 x Denhards, 5 x SSPE, 5% SDS, 20 µg/mL salmon sperm DNA at
55 °C; (d) an isolated nucleic acid molecule that hybridizes with a nucleic acid
sequence selected from the group consisting of SEQ ID NO:1, SEQ ID NO:3,
SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID
NO:15, SEQ ID NO:19, SEQ ID NO:21 and SEQ ID NO:23 under the following
hybridization conditions: 5 x Denhards, 5 x SSPE, 5% SDS, 20 µg/mL salmon
sperm DNA at 55 °C; and (e) an isolated nucleic acid fragment that is
complementary to (a), (b), (c) or (d).
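
Item (e) above refers to a fragment that is complementary to a fragment of (a) through (d). Purely as an illustration, the complementary strand of a DNA sequence, written 5' to 3', can be derived as sketched below. The function name `reverse_complement` and the example sequence are hypothetical and are not taken from the sequence listing; only the unambiguous bases A, C, G and T are handled.

```python
# Illustrative sketch only; names and the example sequence are assumptions.
# Maps each unambiguous DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand of `seq`, written 5' to 3'."""
    # Complement every base, then reverse so the result reads 5' to 3'.
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    example = "ATGGCTAAC"  # hypothetical sequence, not from the sequence listing
    print(reverse_complement(example))
```

Applying the function twice returns the starting sequence, which is a convenient sanity check.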

Additionally the invention provides a nucleic acid fragment, isolated from
corn, encoding an MFP1 polypeptide, the polypeptide having at least 40%
identity to SEQ ID NO:17, over a length of about 672 amino acids as compared
by the Jotun-Hein algorithm.

Similarly the invention provides a nucleic acid fragment, isolated from
soybean, encoding an MFP1 polypeptide, the polypeptide having at least 46%
identity to SEQ ID NO:17 over a length of 388 amino acids as compared by the
Jotun-Hein algorithm.

In another embodiment the invention provides a nucleic acid fragment,
isolated from rice, encoding an MFP1 polypeptide, the polypeptide having at
least 39% identity to SEQ ID NO:17 over a length of 107 amino acids as
compared by the Jotun-Hein algorithm.

In an alternate embodiment the invention provides an isolated nucleic acid
fragment encoding a plant MFP1 polypeptide, the peptide having at least 77%
identity to SEQ ID NO:17.
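
The percent identity figures recited above depend on how two sequences are aligned and on which positions are counted. The following is a minimal sketch, assuming the two sequences have already been aligned by some external method and padded with `-` for gaps; it does not implement the Jotun-Hein algorithm. The function name `percent_identity` and the choice to count only columns where neither sequence has a gap are assumptions made for illustration.

```python
# Illustrative sketch only; the denominator convention is an assumption.
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of gap-free aligned columns in which both residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0   # columns where neither sequence has a gap
    identical = 0  # compared columns with the same residue
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a.upper() == b.upper():
            identical += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Hypothetical aligned peptides, not taken from the sequence listing.
    print(percent_identity("MKT-LA", "MRTQLA"))
```

Other conventions, such as dividing by the full alignment length or by the length of the shorter sequence, give different values for the same alignment, so a threshold like "at least 77% identity" is only meaningful together with the stated comparison method.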

The invention further provides polypeptides encoded by the isolated 
nucleic acid fragments of the present invention. 

In another embodiment the invention provides a chimeric gene comprising
the isolated nucleic acid fragment of the present invention operably linked to
suitable regulatory sequences.

The invention additionally provides a method of altering the level of
expression of a plant MFP1 protein in a host cell comprising: (a) transforming a
host cell with the chimeric gene of the present invention and; (b) growing the
transformed host cell produced in step (a) under conditions that are suitable for
expression of the chimeric gene resulting in production of altered levels of a plant
MFP1 protein in the transformed host cell relative to expression levels of an
untransformed host cell.

The invention additionally provides transformed host cells comprising the
chimeric genes of the present invention.

In an alternate embodiment the invention provides methods of obtaining a
nucleic acid fragment encoding all or a substantial portion of the amino acid
sequence encoding a plant MFP1 protein using portions of the present nucleic
acid sequences as hybridization probes or as primers.

BRIEF DESCRIPTION OF THE DRAWINGS
AND SEQUENCE DESCRIPTIONS

Figure 1 shows a schematic representation of the subfragments E-196 and
H-207 that were expressed in Escherichia coli.

Figure 2A is a gel showing the immunological identification of MFP1-like
proteins in different plant species using the αR50 antibody raised against a
LeMFP1 polypeptide.

Figure 2B is a gel showing the immunological identification of MFP1-like
proteins in different plant species using the α288 antibody raised against a
LeMFP1 polypeptide.

Figure 3 shows the schematic structure of the partial cDNAs isolated from
a Nicotiana tabacum lambda ZAP cDNA library.

Figure 4A shows the percent identical amino acids in pairwise
comparisons of the four MFP1 proteins.

Figure 4B shows the hydrophilicity and secondary structure analysis of
LeMFP1, NtMFP1-1 and AtMFP1.

Figure 5 shows the genomic organization of tobacco MFP1.

The invention can be more fully understood from the following detailed
description and the accompanying sequence descriptions which form part of this
application.

The following sequence descriptions and sequence listings attached hereto
comply with the rules governing nucleotide and/or amino acid sequence
disclosures in patent applications as set forth in 37 C.F.R. §1.821-1.825. The
Sequence Descriptions contain the one letter code for nucleotide sequence
characters and the three letter codes for amino acids as defined in conformity with
the IUPAC-IUBMB standards described in Nucleic Acids Research 13:3021-3030
(1985) and in the Biochemical Journal 219(2):345-373 (1984) which are herein
incorporated by reference. The symbols and format used for nucleotide and
amino acid sequence data comply with the rules set forth in 37 C.F.R. §1.822.

SEQ ID NO:1 is the nucleotide sequence for NtMFP1-1.

SEQ ID NO:2 is the deduced amino acid sequence for NtMFP1-1, encoded
by SEQ ID NO:1.

SEQ ID NO:3 is the nucleotide sequence for NtMFP1-2.

SEQ ID NO:4 is the deduced amino acid sequence for NtMFP1-2, encoded
by SEQ ID NO:3.

SEQ ID NO:5 is the nucleotide sequence which codes for E-196
polypeptide fragment isolated from tomato.

SEQ ID NO:6 is the deduced amino acid sequence for E-196 polypeptide
fragment isolated from tomato, encoded by SEQ ID NO:5.

SEQ ID NO:7 is the nucleotide sequence which codes for H-207 
polypeptide fragment isolated from tomato. 



SEQ ID NO:8 is the deduced amino acid sequence for H-207 polypeptide fragment
isolated from tomato, encoded by SEQ ID NO:7.

SEQ ID NO:9 is the nucleotide sequence for the p7-2 fragment isolated
from tomato.

SEQ ID NO:10 is the nucleotide sequence for the p1-3 fragment isolated
from tomato.

SEQ ID NO:11 is the nucleotide sequence for the T6 fragment isolated
from tobacco.

SEQ ID NO:12 is the nucleotide sequence for the T1 fragment isolated
from tobacco.

SEQ ID NO:13 is the nucleotide sequence for the T2 fragment isolated
from tobacco.

SEQ ID NO:14 is the nucleotide sequence for the T3 fragment isolated
from tobacco.

SEQ ID NO:15 is the nucleotide sequence for the PCR1 fragment isolated
from tobacco.

SEQ ID NO:16 is the nucleotide sequence for LeMFP1.

SEQ ID NO:17 is the deduced amino acid sequence for LeMFP1, encoded
by SEQ ID NO:16.

SEQ ID NO:18 is the nucleotide sequence used as a Southern probe.

SEQ ID NO:19 is the nucleotide sequence comprising the cDNA insert in
clone src3c.pk004.m1 encoding a soybean MFP1 (GmMFP1).

SEQ ID NO:20 is the deduced amino acid sequence of the nucleotide
sequence comprising the cDNA insert in clone src3c.pk004.m1.

SEQ ID NO:21 is the nucleotide sequence comprising the cDNA insert in
clone p0118.chsab48r encoding a corn MFP1.

SEQ ID NO:22 is the deduced amino acid sequence of the nucleotide
sequence comprising the cDNA insert in clone p0118.chsab48r.

SEQ ID NO:23 is the nucleotide sequence comprising the cDNA insert in
clone rca1n.pk022.a11 encoding a rice MFP1.

SEQ ID NO:24 is the deduced amino acid sequence of the nucleotide
sequence comprising the cDNA insert in clone rca1n.pk022.a11.

SEQ ID NO:25 is the nucleotide sequence for PCR primer designed from
T3 fragment isolated from tobacco.

SEQ ID NO:26 is the nucleotide sequence for PCR primer designed from
T1 fragment isolated from tobacco.

DETAILED DESCRIPTION OF THE INVENTION 
The present invention reports the isolation and characterization of cDNAs
corresponding to two tobacco MFP1 genes and the isolation and identification of
MFP1 EST homologs from corn, soybean and rice. No homologs of MFP1 from
tobacco have been described previously. The level of expression of the genes
described here can be altered in the plant by methods of cosuppression and
overexpression. As they are previously undescribed genes involved in a
fundamental cellular mechanism, this can lead to novel developmental phenotypes
that might be beneficial for crop growth and development. In addition, if the
reduction in expression of one of the genes leads to a growth or developmental
defect in the plant, this gene can be used as a novel herbicide target. All isolated
proteins can be used as tools to study the plant nuclear matrix, of which no
components have been isolated at the molecular level. This can lead to the
identification of additional proteins that can be used as described above. Any
related EST sequences can be directly used for the above described applications in
crop plants. All of these sequences can be directly used to broaden our
understanding of the mechanisms of MAR-matrix interactions and the molecular
basis for the described effects on gene expression.

The following definitions are provided for the full understanding of terms 
and abbreviations used in this specification. 

"Polymerase chain reaction" is abbreviated PCR. 
"Expressed sequence tag" is abbreviated EST. 
"Open reading frame" is abbreviated ORF. 
"SDS polyacrylamide gel electrophoresis" is abbreviated SDS-PAGE. 
"Amino acid" is abbreviated AA. 
"Plaque-forming units" is abbreviated pfus. 
"α-Helical" is abbreviated AH. 
"Coiled-coil" is abbreviated CC and refers to an amphiphilic α-helical 
protein structure. 
"Hydrophilicity plot" is abbreviated HP. 
"Matrix attachment region" is abbreviated MAR. MARs are also known 
as matrix-associated regions or scaffold-associated (or attachment) regions. 

The term "MFP" is an abbreviation for MAR-binding filament-like 
protein. "MFP1" refers to the MAR-binding filament-like protein having similar 
characteristics to the protein isolated from tomato as described in Meier et al., 
Plant Cell 8:2105-2115 (1996). "LeMFP1" is the abbreviation for the specific 
MFP1 protein isolated from tomato, as set forth in SEQ ID NO:17. "NtMFP1-1" 
and "NtMFP1-2" are the abbreviations for the first and second MFP1 proteins 
isolated from tobacco, as set forth in SEQ ID NO:2 and 4 respectively. 
"GmMFP1" is the abbreviation for the MFP1 protein isolated from soybean, as 
set forth in SEQ ID NO:20. "ZmMFP1" is the abbreviation for the MFP1 protein 
isolated from corn, as set forth in SEQ ID NO:22. "OsMFP1" is the abbreviation 
for the MFP1 protein isolated from rice, as set forth in SEQ ID NO:24. 
"AtMFP1" is the abbreviation for the MFP1 protein isolated from Arabidopsis, 
released on (http://genome-www.stanford.edu/Arabidopsis/). 

The terms "isolated nucleic acid fragment" or "isolated nucleic acid 
molecule" refer to a polymer of RNA or DNA that is single- or double-stranded, 
optionally containing synthetic, non-natural or altered nucleotide bases. An 
isolated nucleic acid fragment or an isolated nucleic acid molecule in the form of 
a polymer of DNA may be comprised of one or more segments of cDNA, 
genomic DNA, or synthetic DNA. 

The terms "host cell" and "host organism" refer to a cell capable of 
receiving foreign or heterologous genes and expressing those genes to produce an 
active gene product. Suitable host cells include microorganisms such as bacteria 
and fungi, as well as plant cells. 

The term "fragment" refers to a DNA or amino acid sequence comprising 
a subsequence of the nucleic acid sequence or protein of the present invention. 
However, an active fragment of the present invention comprises a sufficient 
portion of the protein to maintain activity. 

The term "substantially similar" refers to nucleic acid fragments wherein 
changes in one or more nucleotide bases result in substitution of one or more 
amino acids, but do not affect the functional properties of the protein encoded by 
the DNA sequence. "Substantially similar" also refers to nucleic acid fragments 
wherein changes in one or more nucleotide bases do not affect the ability of the 
nucleic acid fragment to mediate alteration of gene expression by antisense or 
co-suppression technology. "Substantially similar" also refers to modifications of 
the nucleic acid fragments of the present invention such as deletion or insertion of 
one or more nucleotide bases that do not substantially affect the functional 
properties of the resulting transcript vis-a-vis the ability to mediate alteration of 
gene expression by antisense or co-suppression technology or alteration of the 
functional properties of the resulting protein molecule. It is therefore understood 
that the invention encompasses more than the specific exemplary sequences. 

A "substantial portion" refers to an amino acid or nucleotide sequence 
which comprises enough of the amino acid sequence of a polypeptide or the 
nucleotide sequence of a gene to afford putative identification of that polypeptide 
or gene, either by manual evaluation of the sequence by one skilled in the art, or 
by computer-automated sequence comparison and identification using algorithms 
such as BLAST (Basic Local Alignment Search Tool; Altschul et al., J. Mol. Biol. 
215:403-410 (1990); see also www.ncbi.nlm.nih.gov/BLAST/). In general, a 
sequence of ten or more contiguous amino acids or thirty or more nucleotides is 
necessary in order to putatively identify a polypeptide or nucleic acid sequence as 
homologous to a known protein or gene. Moreover, with respect to nucleotide 
sequences, gene-specific oligonucleotide probes comprising 20-30 contiguous 
nucleotides may be used in sequence-dependent methods of gene identification 
(e.g., Southern hybridization) and isolation (e.g., in situ hybridization of bacterial 
colonies or bacteriophage plaques). In addition, short oligonucleotides (generally 
12 bases or longer) may be used as amplification primers in PCR in order to 
obtain a particular nucleic acid fragment comprising the primers. Accordingly, a 
"substantial portion" of a nucleotide sequence comprises enough of the sequence 
to afford specific identification and/or isolation of a nucleic acid fragment 
comprising the sequence. The present specification teaches partial or complete 
amino acid and nucleotide sequences encoding one or more particular plant 
proteins. The skilled artisan, having the benefit of the sequences as reported 
herein, may now use all or a substantial portion of the disclosed sequences for 
purposes known to those skilled in the art. Accordingly, the present invention 
comprises the complete sequences as reported in the accompanying Sequence 
Listing, as well as substantial portions of those sequences as defined above. 

For example, it is well known in the art that antisense suppression and 
co-suppression of gene expression may be accomplished using nucleic acid 
fragments representing less than the entire coding region of a gene, and by nucleic 
acid fragments that do not share 100% identity with the gene to be suppressed. 
Moreover, alterations in a gene that result in the production of a chemically 
equivalent amino acid at a given site, but do not affect the functional properties of 
the encoded protein, are well known in the art. Thus, a codon for the amino acid 
alanine, a hydrophobic amino acid, may be substituted by a codon encoding 
another less hydrophobic residue, such as glycine, or a more hydrophobic residue, 
such as valine, leucine, or isoleucine. Similarly, changes which result in 
substitution of one negatively charged residue for another, such as aspartic acid 
for glutamic acid, or one positively charged residue for another, such as lysine for 
arginine, can also be expected to produce a functionally equivalent product. 
Nucleotide changes which result in alteration of the N-terminal and C-terminal 
portions of the protein molecule would also not be expected to alter the activity of 
the protein. Each of the proposed modifications is well within the routine skill in 
the art, as is determination of retention of biological activity of the encoded 
products. Moreover, the skilled artisan recognizes that substantially similar 
sequences encompassed by this invention are also defined by their ability to 
hybridize, under stringent conditions (0.1 x SSC, 0.1% SDS, 65 °C), with the 
sequences exemplified herein. Preferred substantially similar nucleic acid 
fragments of the present invention are those nucleic acid fragments whose DNA 
sequences are 80% identical to the DNA sequence of the nucleic acid fragments 
reported herein. More preferred nucleic acid fragments are 90% identical to the 
DNA sequence of the nucleic acid fragments reported herein. Most preferred are 
nucleic acid fragments that are 95% identical to the DNA sequence of the nucleic 
acid fragments reported herein. 
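
By way of non-limiting illustration, the substitution classes named above may be sketched in Python as follows. The grouping is an assumption drawn only from the examples given in the preceding paragraph (alanine, glycine, valine, leucine and isoleucine as hydrophobic residues; aspartic acid and glutamic acid as negatively charged; lysine and arginine as positively charged). The names RESIDUE_CLASSES, residue_class and is_conservative are hypothetical, and the sketch makes no prediction about retention of biological activity, which must still be determined experimentally.

```python
# Residue classes taken from the examples in the text (one-letter codes).
# This grouping is an illustrative assumption, not an exhaustive scheme.
RESIDUE_CLASSES = {
    "hydrophobic": set("AGVLI"),  # Ala, Gly, Val, Leu, Ile
    "negative": set("DE"),        # Asp, Glu
    "positive": set("KR"),        # Lys, Arg
}


def residue_class(residue):
    """Return the class name for a residue, or None if it is not listed."""
    for name, members in RESIDUE_CLASSES.items():
        if residue.upper() in members:
            return name
    return None


def is_conservative(original, replacement):
    """True when both residues fall in the same listed class."""
    cls = residue_class(original)
    return cls is not None and cls == residue_class(replacement)


if __name__ == "__main__":
    for pair in [("D", "E"), ("K", "R"), ("A", "V"), ("D", "K")]:
        print(pair, is_conservative(*pair))
```

Residues outside the three listed classes are reported as non-conservative simply because the paragraph above gives no class for them.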

The term "sequence analysis software" refers to any computer algorithm 
or software program that is useful for the analysis of nucleotide or amino acid 
sequences. "Sequence analysis software" may be commercially available or 
independently developed. Typical sequence analysis software will include but is 
not limited to the GCG suite of programs (Wisconsin Package Version 9.0, 
Genetics Computer Group (GCG), Madison, WI), BLASTP, BLASTN, BLASTX 
(Altschul et al., J. Mol. Biol. 215:403-410 (1990)), and DNASTAR (DNASTAR, 
Inc., 1228 S. Park St., Madison, WI 53715 USA). Within the context of this 
application it will be understood that where sequence analysis software is used for 
analysis, the results of the analysis will be based on the "default values" of 
the program referenced, unless otherwise specified. As used herein "default 
values" will mean any set of values or parameters which originally load with the 
software when first initialized. 

The term "percent identity" is a relationship between two or more 
polypeptide sequences or two or more polynucleotide sequences, as determined by 
comparing the sequences. In the art, "identity" also means the degree of sequence 
relatedness between polypeptide or polynucleotide sequences, as the case may be, 
as determined by the match between strings of such sequences. "Identity" and 
"similarity" can be readily calculated by known methods, including but not 
limited to those described in: Computational Molecular Biology (Lesk, A. M., 
ed.) Oxford University Press, New York (1988); Biocomputing: Informatics and 
Genome Projects (Smith, D. W., ed.) Academic Press, New York (1993); 
Computer Analysis of Sequence Data, Part I (Griffin, A. M., and Griffin, H. G., 
eds.) Humana Press, New Jersey (1994); Sequence Analysis in Molecular Biology 
(von Heijne, G., ed.) Academic Press (1987); and Sequence Analysis Primer 
(Gribskov, M. and Devereux, J., eds.) Stockton Press, New York (1991). 
Preferred methods to determine identity are designed to give the largest match 
between the sequences tested. Methods to determine identity and similarity are 
codified in publicly available computer programs. Preferred computer program 
methods to determine identity and similarity between two sequences include, but 
are not limited to, the GCG Pileup program found in the GCG program package, 
using the Needleman and Wunsch algorithm with their standard default values of 
gap creation penalty=12 and gap extension penalty=4 (Devereux et al., Nucleic 
Acids Res. 12:387-395 (1984)), BLASTP, BLASTN, and FASTA (Pearson et al., 
Proc. Natl. Acad. Sci. USA 85:2444-2448 (1988)). The BLASTX program is 
publicly available from NCBI and other sources (BLAST Manual, Altschul et al., 
Natl. Cent. Biotechnol. Inf., Natl. Library Med. (NCBI NLM) NIH, Bethesda, Md. 
20894; Altschul et al., J. Mol. Biol. 215:403-410 (1990); Altschul et al., "Gapped 
BLAST and PSI-BLAST: a new generation of protein database search programs", 
Nucleic Acids Res. 25:3389-3402 (1997)). The method to determine percent 
identity preferred in the present invention is by the method of DNASTAR protein 
alignment protocol using the Jotun-Hein algorithm (Hein et al., Methods Enzymol. 
183:626-645 (1990)). Default parameters used for the Jotun-Hein method for 
alignments are: for multiple alignments, gap penalty=11, gap length penalty=3; 
for pairwise alignments ktuple^. As an illustration, for a polynucleotide having a 

nucleotide sequence with at least 95% identity to a reference nucleotide sequence, 
it is intended that the nucleotide sequence of the polynucleotide is identical to the 
reference sequence except that the polynucleotide sequence may include up to five 
point mutations per each 100 nucleotides of the reference nucleotide sequence. In 
other words, to obtain a polynucleotide having a nucleotide sequence at least 95% 
identical to a reference nucleotide sequence, up to 5% of the nucleotides in the 
reference sequence may be deleted or substituted with another nucleotide, or a 
number of nucleotides up to 5% of the total nucleotides in the reference sequence 
may be inserted into the reference sequence. These mutations of the reference 
sequence may occur at the 5' or 3' terminal positions of the reference nucleotide 
sequence or anywhere between those terminal positions, interspersed either 
individually among nucleotides in the reference sequence or in one or more 
contiguous groups within the reference sequence. Analogously, for a polypeptide 
having an amino acid sequence having at least 95% "identity" to a reference 
amino acid sequence, it is intended that the amino acid sequence of the 
polypeptide is identical to the reference sequence except that the polypeptide 
sequence may include up to five amino acid alterations per each 100 amino acids 
of the reference amino acid sequence. In other words, to obtain a polypeptide having an 
amino acid sequence at least 95% identical to a reference amino acid sequence, up 
to 5% of the amino acid residues in the reference sequence may be deleted or 
substituted with another amino acid, or a number of amino acids up to 5% of the 
total amino acid residues in the reference sequence may be inserted into the 
reference sequence. These alterations of the reference sequence may occur at the 
amino or carboxy terminal positions of the reference amino acid sequence or 
anywhere between those terminal positions, interspersed either individually 
among residues in the reference sequence or in one or more contiguous groups 
within the reference sequence. 
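
The arithmetic of the foregoing illustration may be sketched in Python as follows. This is a minimal sketch, not the Jotun-Hein method or any DNASTAR routine: it assumes the two sequences have already been aligned by some other program, that gaps are written as "-", and that identity is counted over alignment columns (programs differ in the denominator they use). The function names percent_identity and max_alterations are hypothetical.

```python
def percent_identity(aligned_ref, aligned_query, gap="-"):
    """Percent of alignment columns holding the same residue in both rows.

    Assumes the inputs are two rows of one pairwise alignment (equal length).
    Columns that are gaps in both rows are ignored.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (r, q)
        for r, q in zip(aligned_ref.upper(), aligned_query.upper())
        if not (r == gap and q == gap)
    ]
    if not columns:
        raise ValueError("alignment has no usable columns")
    matches = sum(1 for r, q in columns if r == q and r != gap)
    return 100.0 * matches / len(columns)


def max_alterations(reference_length, min_identity=95.0):
    """Largest number of substituted positions that keeps identity at or
    above min_identity, assuming substitutions only (no insertions)."""
    return int(reference_length * (100.0 - min_identity) / 100.0)


if __name__ == "__main__":
    print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
    print(max_alterations(100))
```

For a reference of 100 residues and a 95% threshold, max_alterations gives five, matching the "up to five alterations per each 100" wording above.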

"Codon degeneracy" refers to divergence in the genetic code permitting 

variation of the nucleotide sequence without affecting the amino acid sequence of 
an encoded polypeptide. Accordingly, the present invention relates to any nucleic 
acid fragment that encodes all or a substantial portion of the present MFP1 proteins as 
set forth in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:20, SEQ ID NO:22 and 
SEQ ID NO:24. The skilled artisan is well aware of the "codon-bias" exhibited 
by a specific host cell to use nucleotide codons to specify a given amino acid. 
Therefore, when synthesizing a gene for improved expression in a host cell, it is 
desirable to design the gene such that its frequency of codon usage approaches the 
frequency of preferred codon usage of the host cell. 
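
As a non-limiting illustration of codon bias, the following Python sketch tabulates codon usage from a set of host coding sequences and back-translates a peptide using the most frequent codon for each amino acid. It assumes the standard genetic code and in-frame DNA coding sequences; the function names and the toy input sequence are hypothetical and are not taken from the Sequence Listing. Choosing the single most frequent codon is only one simple design rule, and other strategies for matching host codon frequencies exist.

```python
from collections import Counter, defaultdict

# Standard genetic code, built in TCAG order; "*" marks stop codons.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def codon_usage(coding_sequences):
    """Return {amino acid: {codon: fraction}} from in-frame coding sequences."""
    counts = defaultdict(Counter)
    for cds in coding_sequences:
        cds = cds.upper()
        for i in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[i:i + 3]
            amino = CODON_TABLE.get(codon)  # None for ambiguous bases
            if amino is not None and amino != "*":
                counts[amino][codon] += 1
    return {
        amino: {codon: n / sum(c.values()) for codon, n in c.items()}
        for amino, c in counts.items()
    }


def preferred_codons(usage):
    """Pick the most frequently used codon for each amino acid."""
    return {amino: max(freqs, key=freqs.get) for amino, freqs in usage.items()}


def back_translate(protein, preferred):
    """Encode a peptide using the preferred codon of each residue."""
    return "".join(preferred[residue] for residue in protein.upper())


if __name__ == "__main__":
    usage = codon_usage(["ATGGCTGCTGCC"])  # toy host sequence (assumption)
    print(back_translate("MA", preferred_codons(usage)))
```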

The term "complementary" is used to describe the relationship between 
nucleotide bases that are hybridizable to one another. Hence with respect to DNA, 
adenosine is complementary to thymine and cytosine is complementary to 
guanine. 
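
The base-pairing relationship just defined may be illustrated by the following short Python sketch, which derives the complementary strand of a DNA sequence. It assumes an unambiguous DNA alphabet (A, C, G, T) and that sequences are written 5' to 3'; the function names are hypothetical.

```python
# Watson-Crick pairing for DNA: A pairs with T, C pairs with G.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def complement(seq):
    """Base-by-base complement, read in the same left-to-right order."""
    return seq.translate(_COMPLEMENT)


def reverse_complement(seq):
    """Complementary strand written 5' to 3'."""
    return complement(seq)[::-1]


def are_complementary(strand_a, strand_b):
    """True when strand_b is the exact reverse complement of strand_a."""
    return reverse_complement(strand_a).upper() == strand_b.upper()


if __name__ == "__main__":
    print(reverse_complement("ATGC"))
    print(are_complementary("ATGC", "GCAT"))
```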

A nucleic acid molecule is "hybridizable" to another nucleic acid 
molecule, such as a cDNA, genomic DNA, or RNA, when a single stranded form 
of the nucleic acid molecule can anneal to the other nucleic acid molecule under 
the appropriate conditions of temperature and solution ionic strength. 
Hybridization and washing conditions are well known and exemplified in 
Sambrook, J., Fritsch, E. F. and Maniatis, T. Molecular Cloning: A Laboratory 
Manual, Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring 
Harbor (1989), particularly Chapter 11 and Table 11.1 therein (entirely 
incorporated herein by reference). The conditions of temperature and ionic 
strength determine the "stringency" of the hybridization. For preliminary 
screening for homologous nucleic acids, low stringency hybridization conditions, 
corresponding to a Tm of 55°, can be used, e.g., 5X SSC, 0.1% SDS, 0.25% milk, 
and no formamide; or 30% formamide, 5X SSC, 0.5% SDS. Moderate stringency 
hybridization conditions correspond to a higher Tm, e.g., 40% formamide, with 5X 
or 6X SSC. 

Hybridization requires that the two nucleic acids contain complementary 
sequences, although depending on the stringency of the hybridization, mismatches 
between bases are possible. The appropriate stringency for hybridizing nucleic 
acids depends on the length of the nucleic acids and the degree of 
complementation, variables well known in the art. The greater the degree of 
similarity or homology between two nucleotide sequences, the greater the value of 
Tm for hybrids of nucleic acids having those sequences. The relative stability 
(corresponding to higher Tm) of nucleic acid hybridizations decreases in the 
following order: RNA:RNA, DNA:RNA, DNA:DNA. For hybrids of greater 
than 100 nucleotides in length, equations for calculating Tm have been derived 
(see Sambrook et al., supra, 9.50-9.51). For hybridizations with shorter nucleic 
acids, i.e., oligonucleotides, the position of mismatches becomes more important, 
and the length of the oligonucleotide determines its specificity (see Sambrook 
et al., supra, 11.7-11.8). In one embodiment the length for a hybridizable nucleic 
acid is at least about 10 nucleotides. Preferably a minimum length for a 
hybridizable nucleic acid is at least about 15 nucleotides; more preferably at least 
about 20 nucleotides; and most preferably the length is at least 30 nucleotides. 
Furthermore, the skilled artisan will recognize that the temperature and wash 
solution salt concentration may be adjusted as necessary according to factors such 
as length of the probe. 
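
The dependence of Tm on ionic strength, base composition, formamide and hybrid length may be illustrated by the following Python sketch. It uses a commonly cited empirical approximation for DNA:DNA hybrids longer than about 100 nucleotides, Tm = 81.5 + 16.6 log10[Na+] + 0.41(%G+C) - 0.63(% formamide) - 600/length; whether this is exactly the form given in the passages of Sambrook cited above is an assumption, and published coefficients vary somewhat. The conversion of SSC strength to sodium concentration assumes 1X SSC is 0.15 M sodium chloride plus 0.015 M trisodium citrate (about 0.195 M Na+). The function names are hypothetical.

```python
import math

# Assumed sodium ion concentration of 1X SSC, in mol/L:
# 0.15 M NaCl + 3 x 0.015 M from trisodium citrate.
SODIUM_PER_1X_SSC = 0.195


def ssc_to_sodium_molar(ssc_fold):
    """Approximate molar Na+ for a given SSC strength (e.g. 5 for 5X SSC)."""
    return ssc_fold * SODIUM_PER_1X_SSC


def estimate_tm(gc_percent, length, sodium_molar, formamide_percent=0.0):
    """Empirical Tm estimate (degrees C) for long DNA:DNA hybrids.

    Intended only for hybrids longer than about 100 nucleotides; it is not
    suitable for short oligonucleotides.
    """
    if length <= 0 or sodium_molar <= 0:
        raise ValueError("length and sodium_molar must be positive")
    return (
        81.5
        + 16.6 * math.log10(sodium_molar)
        + 0.41 * gc_percent
        - 0.63 * formamide_percent
        - 600.0 / length
    )


if __name__ == "__main__":
    sodium = ssc_to_sodium_molar(5)  # 5X SSC
    print(round(estimate_tm(45.0, 500, sodium), 1))
    print(round(estimate_tm(45.0, 500, sodium, formamide_percent=30.0), 1))
```

The sketch shows the qualitative behavior described above: adding formamide or lowering the salt concentration lowers the estimated Tm, which is how hybridization and wash conditions are made more stringent at a fixed temperature.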

"Synthetic genes" can be assembled from oligonucleotide building blocks 
that are chemically synthesized using procedures known to those skilled in the art. 
These building blocks are ligated and annealed to form gene segments which are 
then enzymatically assembled to construct the entire gene. "Chemically 
synthesized", as related to a sequence of DNA, means that the component 
nucleotides were assembled in vitro. Manual chemical synthesis of DNA may be 
accomplished using well established procedures, or automated chemical synthesis 
can be performed using one of a number of commercially available machines. 
Accordingly, the genes can be tailored for optimal gene expression based on 
optimization of nucleotide sequence to reflect the codon bias of the host cell. The 
skilled artisan appreciates the likelihood of successful gene expression if codon 
usage is biased towards those codons favored by the host. Determining preferred 
codons can be based on a survey of genes derived from the host cell where 
sequence information is available. 

"Gene" refers to a nucleic acid fragment that expresses a specific protein, 
including regulatory sequences preceding (5' non-coding sequences) and 
following (3' non-coding sequences) the coding sequence. "Native gene" refers to 
a gene as found in nature with its own regulatory sequences. "Chimeric gene" 
refers to any gene, not a native gene, comprising regulatory and coding sequences 
that are not found together in nature. Accordingly, a chimeric gene may comprise 
regulatory sequences and coding sequences that are derived from different 
sources, or regulatory sequences and coding sequences derived from the same 
source, but arranged in a manner different than that found in nature. "Endogenous 
gene" refers to a native gene in its natural location in the genome of an organism. 
A "foreign" gene refers to a gene not normally found in the host organism, but 
which is introduced into the host organism by gene transfer. Foreign genes can 
comprise native genes inserted into a non-native organism, or chimeric genes. A 
"transgene" is a gene that has been introduced into the genome by a 
transformation procedure. 

"Coding sequence" refers to a DNA sequence that codes for a specific 
amino acid sequence. "Regulatory sequences" refer to nucleotide sequences 
located upstream (5' non-coding sequences), within, or downstream (3' non-coding 
sequences) of a coding sequence, and which influence the transcription, 
RNA processing or stability, or translation of the associated coding sequence. 
Regulatory sequences may include promoters, translation leader sequences, 
introns, and polyadenylation recognition sequences. 

"Promoter" refers to a DNA sequence capable of controlling the 

expression of a coding sequence or functional RNA. In general, a coding 
sequence is located 3' to a promoter sequence. The promoter sequence consists of 
proximal and more distal upstream elements, the latter elements often referred to 
as enhancers. Accordingly, an "enhancer" is a DNA sequence which can 
stimulate promoter activity and may be an innate element of the promoter or a 
heterologous element inserted to enhance the level or tissue-specificity of a 
promoter. Promoters may be derived in their entirety from a native gene, or be 
composed of different elements derived from different promoters found in nature, 
or even comprise synthetic DNA segments. It is understood by those skilled in 
the art that different promoters may direct the expression of a gene in different 
tissues or cell types, or at different stages of development, or in response to 
different environmental conditions. Promoters which cause a gene to be 
expressed in most cell types at most times are commonly referred to as 
"constitutive promoters". New promoters of various types useful in plant cells are 
constantly being discovered; numerous examples may be found in the compilation 
by Okamuro and Goldberg (Biochemistry of Plants 15:1-82 (1989)). It is further 
recognized that since in most cases the exact boundaries of regulatory sequences 
have not been completely defined, DNA fragments of different lengths may have 
identical promoter activity. 

The "translation leader sequence" refers to a DNA sequence located 
between the promoter sequence of a gene and the coding sequence. The 
translation leader sequence is present in the fully processed mRNA upstream of 
the translation start sequence. The translation leader sequence may affect 
processing of the primary transcript to mRNA, mRNA stability or translation 
efficiency. Examples of translation leader sequences have been described (Turner 
et al., Mol. Biotech. 3:225 (1995)). 

The "3' non-coding sequences" refer to DNA sequences located 
downstream of a coding sequence and include polyadenylation recognition 
sequences and other sequences encoding regulatory signals capable of affecting 
mRNA processing or gene expression. The polyadenylation signal is usually 
characterized by affecting the addition of polyadenylic acid tracts to the 3' end of 
the mRNA precursor. The use of different 3' non-coding sequences is exemplified 
by Ingelbrecht et al. (Plant Cell 1:671-680 (1989)). 

"RNA transcript" refers to the product resulting from RNA polymerase-catalyzed 
transcription of a DNA sequence. When the RNA transcript is a perfect 
complementary copy of the DNA sequence, it is referred to as the primary 
transcript or it may be an RNA sequence derived from posttranscriptional 
processing of the primary transcript and is referred to as the mature RNA. 
"Messenger RNA" (mRNA) refers to the RNA that is without introns and that can 
be translated into protein by the cell. "cDNA" refers to a double-stranded DNA 
that is complementary to and derived from mRNA. "Sense" RNA refers to RNA 
transcript that includes the mRNA and so can be translated into protein by the 
cell. "Antisense RNA" refers to an RNA transcript that is complementary to all or 
part of a target primary transcript or mRNA and that blocks the expression of a 
target gene (U.S. 5,107,065). The complementarity of an antisense RNA may be 
with any part of the specific gene transcript, i.e., at the 5' non-coding sequence, 3' 
non-coding sequence, introns, or the coding sequence. "Functional RNA" refers 
to antisense RNA, ribozyme RNA, or other RNA that is not translated yet has an 
effect on cellular processes. 

The term "operably-linked" refers to the association of nucleic acid 
sequences on a single nucleic acid fragment so that the function of one is affected 
by the other. For example, a promoter is operably-linked with a coding sequence 
when it affects the expression of that coding sequence (i.e., that the coding 
sequence is under the transcriptional control of the promoter). Coding sequences 
can be operably-linked to regulatory sequences in sense or antisense orientation. 

The term "expression" refers to the transcription and stable accumulation 
of sense (mRNA) or antisense RNA derived from the nucleic acid fragment of the 
invention. Expression may also refer to translation of mRNA into a polypeptide. 
"Antisense inhibition" refers to the production of antisense RNA transcripts 
capable of suppressing the expression of the target protein. "Overexpression" 
refers to the production of a gene product in transgenic organisms that exceeds 
levels of production in normal or non-transformed organisms. "Co-suppression" 
refers to the production of sense RNA transcripts capable of suppressing the 
expression of identical or substantially similar foreign or endogenous genes 
(U.S. 5,231,020). 

"Altered levels" refers to the production of gene product(s) in organisms 
in amounts or proportions that differ from that of normal or non-transformed 
organisms. 

"Mature" protein refers to a post-translationally processed polypeptide; 
i.e., one from which any pre- or propeptides present in the primary translation 
product have been removed. "Precursor" protein refers to the primary product of 
translation of mRNA; i.e., with pre- and propeptides still present. Pre- and 
propeptides may be but are not limited to intracellular localization signals. 

A "chloroplast transit peptide" is an amino acid sequence which is 
translated in conjunction with a protein and directs the protein to the chloroplast 
or other plastid types present in the cell in which the protein is made. 
"Chloroplast transit sequence" refers to a nucleotide sequence that encodes a 
chloroplast transit peptide. A "signal peptide" is an amino acid sequence which 
is translated in conjunction with a protein and directs the protein to the secretory 
system (Chrispeels, J. J., (1991) Ann. Rev. Plant Phys. Plant Mol. Biol. 42:21-53). 
If the protein is to be directed to a vacuole, a vacuolar targeting signal (supra) can 
further be added, or if to the endoplasmic reticulum, an endoplasmic reticulum 
retention signal (supra) may be added. If the protein is to be directed to the 
nucleus, any signal peptide present should be removed and instead a nuclear 
localization signal included (Raikhel (1992) Plant Phys. 100:1627-1632). 

"Transformation" refers to the transfer of a nucleic acid fragment into the 

genome of a host organism, resulting in genetically stable inheritance. Host 
organisms containing the transformed nucleic acid fragments are referred to as 
"transgenic" organisms. Examples of methods of plant transformation include 
Agrobacterium-mediated transformation (De Blaere et al., Meth. Enzymol. 
143:277 (1987)) and particle-accelerated or "gene gun" transformation technology 
(Klein et al., Nature, London 327:70-73 (1987); U.S. 4,945,050). 

Standard recombinant DNA and molecular cloning techniques used herein 
are well known in the art and are described more fully in Sambrook, J., Fritsch, 
E.F. and Maniatis, T. Molecular Cloning: A Laboratory Manual; Cold Spring 
Harbor Laboratory Press: Cold Spring Harbor, 1989 (hereinafter "Maniatis"). 

Novel MFP1-binding proteins have been isolated from tobacco, corn, 
soybean and rice. Comparison of their random cDNA sequences to the GenBank 
database using the BLAST and DNASTAR algorithms, well known to those 
skilled in the art, revealed that these proteins have no significant homologies to 
other known proteins, other than MFP1 proteins. The nucleotide sequences of the 
present MFP1 cDNA are provided in SEQ ID NO:1, SEQ ID NO:3, SEQ ID 
NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ 
ID NO:19, SEQ ID NO:21 and SEQ ID NO:23. Other MFP1 genes and proteins 
from other plants can now be identified by comparison of random cDNA 
sequences to the present MFP1 sequences provided herein. 

Comparison of the instant MFP1 deduced amino acid sequences to 
the only published sequence of this kind (LeMFP1, Meier et al., Plant Cell 
8:2105-2115 (1996); SEQ ID NO:17 and 18) shows homology ranging from 
about 39% identity (rice, SEQ ID NO:24) over a length of 107 amino acids to 
about 77% identity for tobacco (SEQ ID NO:2 and 4) as compared by the 
Jotun-Hein alignment algorithm (Hein et al., Methods Enzymol. 183:626-645 
(1990)). 

Accordingly, preferred polypeptides of the instant invention are those plant 
proteins which are at least 77% identical to the amino acid sequence as set forth 
in SEQ ID NO:17. More preferred amino acid fragments are at least about 80%-90% 
identical to the sequences herein. Most preferred are amino acid fragments that 
are at least 95% identical to the amino acid fragments reported herein. Similarly, 
preferred nucleic acid sequences are those encoding MFP1 binding proteins and 
which are at least 80% identical to the nucleic acid sequences reported herein. 
More preferred nucleic acid fragments are at least 90% identical to the sequences 
herein. Most preferred are nucleic acid fragments that are at least 95% identical to 
the nucleic acid fragments reported herein. 

Similarly preferred polypeptides are those isolated from corn which are at 
least 40% identical to the polypeptide of SEQ ID NO:17 over a length of about 
672 amino acids as compared by the Jotun-Hein alignment algorithm (Hein et al., 
supra). Other preferred polypeptides are those isolated from rice which are at 
least 39% identical to the polypeptide of SEQ ID NO:17 over a length of about 
107 amino acids as compared by the Jotun-Hein alignment algorithm (Hein et al., 
supra). Additionally preferred polypeptides are those isolated from soybean 
which are at least 46% identical to the polypeptide of SEQ ID NO:17 over a 
length of about 388 amino acids as compared by the Jotun-Hein alignment 
algorithm (Hein et al., supra). 
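
Purely by way of illustration, the species-specific figures recited in the preceding paragraph may be held in a small lookup table and applied to an alignment result, as in the Python sketch below. The sketch assumes that a percent identity and an aligned length have already been obtained from an alignment program, and it treats the recited "length of about" figures as minimum aligned lengths, which is a simplifying assumption. The names PREFERRED_MINIMUMS and meets_preferred_minimum are hypothetical.

```python
# species: (minimum percent identity to SEQ ID NO:17,
#           approximate aligned length in amino acids)
# Values are those recited in the text above for corn, rice and soybean.
PREFERRED_MINIMUMS = {
    "corn": (40.0, 672),
    "rice": (39.0, 107),
    "soybean": (46.0, 388),
}


def meets_preferred_minimum(species, percent_identity, aligned_length):
    """True when an alignment result reaches the recited figures.

    Treating the recited length as a strict minimum is an assumption made
    for this sketch only.
    """
    min_identity, min_length = PREFERRED_MINIMUMS[species]
    return percent_identity >= min_identity and aligned_length >= min_length


if __name__ == "__main__":
    print(meets_preferred_minimum("rice", 39.0, 107))
    print(meets_preferred_minimum("corn", 39.9, 672))
```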

The nucleic acid fragments of the present invention may be used to isolate 
cDNAs and genes encoding homologous MFP1 proteins from the same or other 
plant species. Isolating homologous genes using sequence-dependent protocols is 
well known in the art. Examples of sequence-dependent protocols include, but 
are not limited to, methods of nucleic acid hybridization and methods of DNA and 
RNA amplification as exemplified by various uses of nucleic acid amplification 
technologies (e.g., polymerase chain reaction (PCR) or ligase chain reaction). 

For example, other MFP1 genes, either as cDNAs or genomic DNAs, 
could be isolated directly by using all or a portion of the present nucleic acid 
fragments as DNA hybridization probes to screen libraries from any desired plant 
using methodology well known to those skilled in the art. Specific 
oligonucleotide probes based upon the present MFP1 sequences can be designed 
and synthesized by methods known in the art (Maniatis, supra). Moreover, the 
entire sequences can be used directly to synthesize DNA probes by methods 
known to the skilled artisan, such as random-primer DNA labeling, nick 
translation, or end-labeling techniques, or RNA probes using available in vitro 
transcription systems. In addition, specific primers can be designed and used to 
amplify a part of or full-length of the present sequences. The resulting 
amplification products can be labeled directly during amplification reactions or 
labeled after amplification reactions, and used as probes to isolate full length 
cDNA or genomic fragments under conditions of appropriate stringency. 
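
As an illustration of the primer design step mentioned above, the following Python sketch estimates a primer melting temperature with the Wallace rule (2 °C per A or T, 4 °C per G or C), a common rule of thumb for short oligonucleotides. The rule is not part of the present disclosure, and the primer sequence shown is a hypothetical placeholder, not a sequence of the present invention.

```python
def wallace_tm(primer: str) -> int:
    """Estimate the melting temperature (deg C) of a short primer.

    Wallace rule: Tm = 2 * (A + T) + 4 * (G + C).
    This is a rough estimate that is usually applied to short oligonucleotides.
    """
    seq = primer.upper()
    unknown = set(seq) - set("ACGT")
    if unknown:
        raise ValueError(f"unexpected characters in primer: {sorted(unknown)}")
    at_count = seq.count("A") + seq.count("T")
    gc_count = seq.count("G") + seq.count("C")
    return 2 * at_count + 4 * gc_count


# Hypothetical 20-mer used only for illustration (not from SEQ ID NO:1-26).
example_primer = "ATGGCTTCTCTTCAAGGTAC"
print(wallace_tm(example_primer))
```

In practice the two primers of a pair would be chosen so that their estimated melting temperatures are similar, and the annealing temperature set somewhat below that value.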

In addition, two short segments of the present nucleic acid fragment may 
be used in PCR protocols to amplify longer nucleic acid fragments encoding 
homologous MFP1 genes from DNA or RNA. The polymerase chain reaction 
may also be performed on a library of cloned nucleic acid fragments wherein the 
sequence of one primer is derived from the present nucleic acid fragments, and the 
sequence of the other primer takes advantage of the presence of the polyadenylic
acid tracts at the 3' end of the mRNA precursor encoding plant MFP1.

Alternatively, the second primer sequence may be based upon sequences 
derived from the cloning vector. For example, the skilled artisan can follow the 
RACE protocol (Frohman et al., Proc. Natl. Acad. Sci. USA 85:8998 (1988)) to
generate cDNAs by using PCR to amplify copies of the region between a single
point in the transcript and the 3' or 5' end. Primers oriented in the 3' and 5'
directions can be designed from the present sequences. Using commercially
available 3' RACE or 5' RACE systems (BRL), specific 3' or 5' cDNA fragments
can be isolated (Ohara et al., Proc. Natl. Acad. Sci. USA 86:5673 (1989); Loh
et al., Science 243:217 (1989)). Products generated by the 3' and 5' RACE
procedures can be combined to generate full-length cDNAs (Frohman et al.,
Techniques 1:165 (1989)).

Finally, availability of the present nucleotide and deduced amino acid
sequences facilitates immunological screening of cDNA expression libraries.
Synthetic peptides representing portions of the present amino acid sequences may
be synthesized. These peptides can be used to immunize animals to produce
polyclonal or monoclonal antibodies with specificity for peptides or proteins
comprising the amino acid sequences. These antibodies can then be used to
screen cDNA expression libraries to isolate full-length cDNA clones of interest
(Lerner et al., Adv. Immunol. 36:1 (1984); Maniatis, supra).

The nucleic acid fragments of the present invention may also be used to
create transgenic plants in which the present MFP1 protein is present at higher or
lower levels than normal. Alternatively, in some applications, it might be
desirable to express the present MFP1 protein in specific plant tissues and/or cell
types, or during developmental stages in which it would normally not be
encountered.

Overexpression of the present MFP1 may be accomplished by first
constructing a chimeric gene in which the MFP1 coding region is operably linked
to a promoter capable of directing expression of a gene in the desired tissues at the
desired stage of development. For reasons of convenience, the chimeric gene may
comprise promoter sequences and translation leader sequences derived from the
same genes. 3' non-coding sequences encoding transcription termination signals
must also be provided. The present chimeric genes may also comprise one or
more introns in order to facilitate gene expression.

Plasmid vectors comprising the present chimeric genes can then be
constructed. The choice of a plasmid vector depends upon the method that will be
used to transform host plants. The skilled artisan is well aware of the genetic
elements that must be present on the plasmid vector in order to successfully
transform, select and propagate host cells containing the chimeric gene. The
skilled artisan will also recognize that different independent transformation events
will result in different levels and patterns of expression (Jones et al., EMBO J.
4:2411-2418 (1985); De Almeida et al., Mol. Gen. Genetics 218:78-86 (1989)),
and thus that multiple events must be screened in order to obtain lines displaying
the desired expression level and pattern. Such screening may be accomplished by
Southern analysis of DNA, Northern analysis of mRNA expression, Western
analysis of protein expression, or phenotypic analysis.

For some applications it may be useful to direct the MFP1 protein to
different cellular compartments or to facilitate its secretion from the cell. The
chimeric genes described above may be further modified by the addition of
appropriate intracellular or extracellular targeting sequences to their coding
regions. These include chloroplast transit peptides (Keegstra et al., Cell
56:247-253 (1989)), signal sequences that direct proteins to the endoplasmic
reticulum (Chrispeels et al., Ann. Rev. Plant Phys. Plant Mol. Biol. 42:21-53
(1991)), and nuclear localization signals (Raikhel et al., Plant Phys.
100:1627-1632 (1992)). While the references cited give examples of each of
these, the list is not exhaustive and more targeting signals of utility may be
discovered in the future.

It may also be desirable to reduce or eliminate expression of the MFP1
genes in plants for some applications. In order to accomplish this, chimeric genes
designed for antisense or co-suppression of MFP1 can be constructed by linking
the genes or gene fragments encoding parts of these proteins to plant promoter
sequences. Thus, chimeric genes designed to express antisense RNA for all or
part of MFP1 can be constructed by linking the MFP1 genes or gene fragments in
reverse orientation to plant promoter sequences. The co-suppression or antisense
chimeric gene constructs could be introduced into plants via well known
transformation protocols wherein expression of the corresponding endogenous
genes is reduced or eliminated.
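
The effect of linking a gene fragment in reverse orientation to a promoter can be illustrated with a short Python sketch: the transcript produced is the reverse complement of the sense strand, and is therefore complementary to the endogenous mRNA. The input sequence below is a hypothetical six-base placeholder, and the function names are illustrative only.

```python
# Translation table for complementing DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]


def antisense_rna(sense_strand: str) -> str:
    """Return the RNA transcribed when the fragment is cloned in reverse
    orientation behind a promoter, written 5' to 3'."""
    return reverse_complement(sense_strand).upper().replace("T", "U")


# Hypothetical fragment used only for illustration.
sense_fragment = "ATGGCC"
print(reverse_complement(sense_fragment))
print(antisense_rna(sense_fragment))
```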

The present MFP1 proteins may be produced in heterologous host cells,
particularly in the cells of microbial hosts, and can be used to prepare antibodies
to the proteins by methods well known to those skilled in the art. The antibodies
would be useful for detecting the present MFP1 protein in situ in cells or in vitro
in cell extracts. Preferred heterologous host cells for production of the present
MFP1 protein are microbial hosts. Microbial expression systems and expression
vectors containing regulatory sequences that direct high level expression of
foreign proteins are well known to those skilled in the art. Any of these could be
used to construct a chimeric gene for production of the present MFP1. This
chimeric gene could then be introduced into appropriate microorganisms via
transformation to provide high level expression of the present MFP1 protein.

Microbial host cells suitable for the expression of the present MFP1
proteins include any cell capable of expression of the chimeric genes encoding
these proteins. Such cells will include both bacteria and fungi including, for
example, the yeasts (e.g., Aspergillus, Saccharomyces, Pichia, Candida, and
Hansenula), members of the genus Bacillus as well as the enteric bacteria (e.g.,
Escherichia, Salmonella, and Shigella). Methods for the transformation of such
hosts and the expression of foreign proteins are well known in the art and
examples of suitable protocols may be found in Manual of Methods for General
Bacteriology (Gerhardt et al., eds., American Society for Microbiology,
Washington, DC (1994)) or in Biotechnology: A Textbook of Industrial
Microbiology, Second Edition (Brock, T. D., Sinauer Associates, Inc., Sunderland,
MA (1989)).

Vectors or cassettes useful for transforming suitable microbial host cells
are well known in the art. Typically the vector or cassette contains sequences
directing transcription and translation of the relevant gene, a selectable marker,
and sequences allowing autonomous replication or chromosomal integration.
Suitable vectors comprise a region 5' of the gene which harbors transcriptional
initiation controls and a region 3' of the DNA fragment which controls
transcriptional termination. It is most preferred when both control regions are
derived from genes homologous to the transformed host cell, although such
control regions need not be derived from the genes native to the specific species
chosen as a production host.

Initiation control regions or promoters useful to drive expression of the
genes encoding the MFP1 proteins in the desired host cell are numerous and
familiar to those skilled in the art. Virtually any promoter capable of driving
these genes is suitable for the present invention including but not limited to
CYC1, HIS3, GAL1, GAL10, ADH1, PGK, PHO5, GAPDH, ADC1, TRP1,
URA3, LEU2, ENO, TPI (useful for expression in Saccharomyces); AOX1
(useful for expression in Pichia); and lac, trp, lambda PL, lambda PR, T7, tac,
and trc (useful for expression in E. coli). Termination control regions may also
be derived from various genes native to the preferred hosts. Optionally, a
termination site may be unnecessary; however, it is most preferred if included.

Additionally, the present MFP1 proteins can be used as targets to facilitate 
the design and/or identification of inhibitors of MFP1 that may be useful as 
herbicides or fungicides. This could be achieved either through the rational 
design and synthesis of potent functional inhibitors that result from structural
and/or mechanistic information that is derived from the purified present plant
proteins, or through random in vitro screening of chemical libraries. It is 
anticipated that significant in vivo inhibition of any of the MFP1 proteins 
described herein may severely cripple cellular metabolism and likely result in 
plant (or fungal) death. 

All or a portion of the nucleic acid fragments of the present invention may
also be used as probes for genetically and physically mapping the genes that they
are a part of, and as markers for traits linked to expression of the present MFP1.
Such information may be useful in plant breeding in order to develop lines with
desired phenotypes. For example, the present nucleic acid fragments may be used
as restriction fragment length polymorphism (RFLP) markers. Southern blots
(Maniatis, supra) of restriction-digested plant genomic DNA may be probed with
the nucleic acid fragments of the present invention. The resulting banding
patterns may then be subjected to genetic analyses using computer programs such
as MapMaker (Lander et al., Genomics 1:174-181 (1987)) in order to construct a
genetic map. In addition, the nucleic acid fragments of the present invention may
be used to probe Southern blots containing restriction endonuclease-treated
genomic DNAs of a set of individuals representing parent and progeny of a
defined genetic cross. Segregation of the DNA polymorphisms is noted and used
to calculate the position of the present nucleic acid sequence in the genetic map
previously obtained using this population (Botstein et al., Am. J. Hum. Genet.
32:314-331 (1980)).

The production and use of plant gene-derived probes for use in genetic
mapping is described by Bernatzky et al. (Plant Mol. Biol. Reporter 4:37-41
(1986)). Numerous publications describe genetic mapping of specific cDNA
clones using the methodology outlined above or variations thereof. For example,
F2 intercross populations, backcross populations, randomly mated populations,
near isogenic lines, and other sets of individuals may be used for mapping. Such
methodologies are well known to those skilled in the art.

Nucleic acid probes derived from the present nucleic acid sequences may 
also be used for physical mapping (i.e., placement of sequences on physical maps; 
see Hoheisel et al., Nonmammalian Genomic Analysis: A Practical Guide,
pp. 319-346, Academic Press (1996), and references cited therein). 

In another embodiment, nucleic acid probes derived from the present
nucleic acid sequence may be used in direct fluorescence in situ hybridization
(FISH) mapping. Although current methods of FISH mapping favor use of large 
clones (several to several hundred kb), improvements in sensitivity may allow 
performance of FISH mapping using shorter probes. 

A variety of nucleic acid amplification-based methods of genetic and
physical mapping may be carried out using the present nucleic acid sequences.
Examples include allele-specific amplification (Kazazian et al., J. Lab. Clin. Med.
114:95-96 (1989)), polymorphism of PCR-amplified fragments (CAPS; Sheffield
et al., Genomics 16:325-332 (1993)), allele-specific ligation (Landegren et al.,
Science 241:1077-1080 (1988)), nucleotide extension reactions (Sokolov et al.,
Nucleic Acid Res. 18:3671 (1990)), Radiation Hybrid Mapping (Walter et al.,
Nature Genetics 7:22-28 (1997)) and Happy Mapping (Dear et al., Nucleic Acid
Res. 17:6795-6807 (1989)). For these methods, the sequence of a nucleic acid
fragment is used to design and produce primer pairs for use in the amplification
reaction or in primer extension reactions. The design of such primers is well
known to those skilled in the art. In methods using PCR-based genetic mapping,
it may be necessary to identify DNA sequence differences between the parents of
the mapping cross in the region corresponding to the present nucleic acid
sequence. This, however, is generally not necessary for mapping methods.

Loss-of-function mutant phenotypes may be identified for the present cDNA
clones either by targeted gene disruption protocols or by identifying specific
mutants for these genes contained in a maize population carrying mutations in all
possible genes (Ballinger et al., Proc. Natl. Acad. Sci. USA 86:9402 (1989); Koes
et al., Proc. Natl. Acad. Sci. USA 92:8149 (1995); Bensen et al., Plant Cell 7:75
(1995)). The latter approach may be accomplished in two ways. First, short
segments of the present nucleic acid fragments may be used in polymerase chain
reaction protocols in conjunction with a mutation tag sequence primer on DNAs
prepared from a population of plants in which Mutator transposons or some other
mutation-causing DNA element has been introduced (see Bensen, supra). The
amplification of a specific DNA fragment with these primers indicates the
insertion of the mutation tag element in or near the plant gene encoding the MFP1
protein. Alternatively, the present nucleic acid fragment may be used as a
hybridization probe against PCR amplification products generated from the
mutation population using the mutation tag sequence primer in conjunction with
an arbitrary genomic site primer, such as that for a restriction enzyme
site-anchored synthetic adaptor. With either method, a plant containing a mutation in
the endogenous gene encoding an MFP1 protein can be identified and obtained.
This mutant plant can then be used to determine or confirm the natural function of
the MFP1 gene product.

The present invention is further defined in the following Examples, in 
which all parts and percentages are by weight and degrees are Celsius, unless 
otherwise stated. It should be understood that these Examples, while indicating
preferred embodiments of the invention, are given by way of illustration only.

From the above discussion and these Examples, one skilled in the art can ascertain 
the essential characteristics of this invention, and without departing from the spirit 
and scope thereof, can make various changes and modifications of the invention to 
adapt it to various usage and conditions. 

EXAMPLES
GENERAL METHODS 

Standard recombinant DNA and molecular cloning techniques used here
are well known in the art and are described by Sambrook, J., Fritsch,
E. F. and Maniatis, T., Molecular Cloning: A Laboratory Manual, Cold Spring
Harbor Laboratory Press, Cold Spring Harbor, 1989 (hereinafter "Maniatis"); and
by T. J. Silhavy, M. L. Berman, and L. W. Enquist, Experiments with Gene
Fusions, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1984)
and by Ausubel et al., Current Protocols in Molecular Biology, pub. by Greene
Publishing Assoc. and Wiley-Interscience (1987).

Nucleotide and amino acid percent identity and similarity comparisons 
were made using the DNASTAR suite of programs, applying default parameters 
unless indicated otherwise. 

The meaning of abbreviations is as follows: "sec" means second(s),
"min" means minute(s), "h" means hour(s), "d" means day(s), "µL" means
microliter(s), "mL" means milliliter(s), "L" means liter(s), "mM" means millimolar,
"M" means molar, "mmol" means millimole(s).
Plant material and growth conditions:

Tobacco, tomato, soybean, rice, corn, wheat and Arabidopsis thaliana
were grown in soil in a growth chamber with a 12 h, 24 °C light cycle followed by
a 12 h, 20 °C dark cycle.

EXAMPLE 1 
Isolation Of Total Protein

Total protein extracts were prepared from leaf tissues. 100 mg aliquots of
tissue were ground to a fine powder with mortar and pestle in liquid nitrogen,
resuspended in 0.5 mL extraction buffer (62.5 mM Tris-Cl, pH 6.8, 20% glycerol,
4% SDS, and 1.4 M β-mercaptoethanol) and incubated at 70 °C for 10 min. The
debris was removed by centrifugation at 15,000 rpm for 10 min at 4 °C. The
supernatants were removed to new tubes, frozen in liquid nitrogen, and stored at
-80 °C.

EXAMPLE 2 

Protein Expression, Purification And Antibody Production

pRSETC-MFP1-EcoRI (containing the coiled-coil domain) and
pRSETA-MFP1-Hindi (containing the DNA binding domain), the expression
vectors for E-196 (SEQ ID NO:6) coded by SEQ ID NO:5 and H-207 (SEQ ID
NO:8) coded by SEQ ID NO:7 fragments, respectively, have been described
previously (Meier et al., Plant Cell 8:2105-2115 (1996)). Figure 1 shows a
representation of the subfragments E-196 and H-207 that were expressed in
Escherichia coli. Filled bars indicate α-helical regions, open bars indicate
hydrophobic domains. The shaded box marks the DNA-binding domain.
Numbers indicate the position of the first and last amino acid of each subfragment.
Expression of recombinant fusion proteins containing an N-terminal 6-histidine
tag fused to the protein subfragments E-196 (SEQ ID NO:6) and H-207 (SEQ ID
NO:8) was induced by isopropyl-D-thiogalactoside in Escherichia coli BL21 cells
according to the Qiagen protein expression manual (Qiagen, Chatsworth, CA).
The amount of fusion protein present in the different total E. coli protein extracts
was determined by immunoblotting (Maniatis) with a monoclonal antibody
directed against the T7 tag (Novagen, Madison, Wisconsin). The expressed
proteins were purified by nickel-affinity chromatography (Qiagen, Chatsworth,
CA), followed by SDS PAGE. The bands corresponding to the fusion proteins
(~1 mg each) were excised from the gel, ground and used to raise two rabbit
antisera (a288 against E-196 (SEQ ID NO:6) and aR50 against H-207 (SEQ ID
NO:8)). Polyclonal antibodies were produced in rabbits by Eurogentec,
Belgium, using the company's standard immunization protocols. The a288
antibody has been described previously (Meier et al., Plant Cell 8:2105-2115
(1996)).

EXAMPLE 3

Immuno-detection Of MFP1 Related Proteins
In A Variety Of Higher Plant Species

A 1:3000 dilution of a288 or aR50 antiserum, and a 1:5000 dilution of
horseradish peroxidase-coupled anti-rabbit secondary antibody (Amersham,
Buckinghamshire, England) were used to perform immunoblot analyses
(Maniatis). Enhanced chemiluminescence detection was performed using an ECL
detection kit as described by the manufacturer (Amersham, Buckinghamshire,
England).
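
The dilutions stated above translate into pipetting volumes by simple proportion. The following Python sketch shows the arithmetic; the 15 mL incubation volume is an assumed value chosen for illustration and is not specified in this Example.

```python
def stock_volume_ul(final_volume_ml: float, dilution_factor: float) -> float:
    """Volume of undiluted antiserum (in microliters) needed to prepare
    final_volume_ml of a 1:dilution_factor dilution."""
    if final_volume_ml <= 0 or dilution_factor <= 0:
        raise ValueError("volume and dilution factor must be positive")
    return final_volume_ml * 1000.0 / dilution_factor


# Assumed incubation volume (not given in the text).
incubation_volume_ml = 15.0

primary_ul = stock_volume_ul(incubation_volume_ml, 3000)    # 1:3000 antiserum
secondary_ul = stock_volume_ul(incubation_volume_ml, 5000)  # 1:5000 secondary antibody
print(primary_ul, secondary_ul)
```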

a288 and aR50 polyclonal antibodies were then used to detect proteins
with antigenic similarity to MFP1 in other plant species (Figure 2). Total protein
extracts were prepared from mature leaf tissues of tomato (Lycopersicon
esculentum L.), tobacco (Nicotiana tabacum L.), Arabidopsis thaliana, soybean
(Glycine max L.), rice (Oryza sativa L.), wheat (Triticum aestivum L.), and corn
(Zea mays L.) as described above. Equal amounts of total protein, as determined
by Coomassie Brilliant Blue staining of a replica protein gel, were probed in
immunoblot experiments with aR50 (Figure 2A) and a288 (Figure 2B) polyclonal
antibodies. The arrow indicates the position of the MFP1-like proteins of
approximately equal size in both panels. The position of molecular weight
markers is indicated. aR50 antibody detects a single protein of slightly variable
size in all species tested. A second band of higher molecular weight (asterisk in
Figure 2A) was only occasionally observed in tomato and tobacco extracts
(tobacco not shown in Figures 2A-B) and might represent an aggregate of MFP1.
In contrast, a288 antibody only detected a protein of about 80 kD in tomato and
tobacco extracts, suggesting that the DNA-binding domain of MFP1 is more
highly conserved than the part of the coiled-coil domain present on fragment
E-196.

Together, these data indicate that a protein of similar size, containing a
related DNA-binding domain, is conserved among higher plant species and that
the highest degree of similarity to tomato (LeMFP1) among the plants
investigated can be expected in tobacco.

EXAMPLE 4

Cloning and Characterization Of Several Tobacco MFP1 cDNAs
Corresponding To Two Tobacco MFP1 Proteins

Example 4 describes the cloning and characterization of two distinct
MFP1 proteins from tobacco.

Isolation Of cDNA By Hybridization

The cDNAs encoding tobacco MFP1 were cloned and characterized. An
oligo-dT-primed lambda-ZAP cDNA library made from Nicotiana tabacum var.
SR1 leaf tissue was purchased from Stratagene (La Jolla, CA). The library was
screened in a DNA-hybridization screen according to Maniatis with a 1.6 kb
partial cDNA clone representing the 3' 2/3 of the tomato homolog of MFP1,
LeMFP1 cDNA (p7-2; SEQ ID NO:9) (Meier et al., Plant Cell 8:2105-2115
(1996)) or a 1.0 kb 5' partial LeMFP1 cDNA clone (p1-3; SEQ ID NO:10) (Meier
et al., Plant Cell 8:2105-2115 (1996)). Hybridization conditions were 5 x
Denhardt's (Maniatis), 5 x SSPE (Maniatis), 5% SDS, 20 µg/mL salmon sperm
DNA at 55 °C. Washes were performed at high stringency (0.1 x SSC, 0.1% SDS
at 65 °C). Positive plaques were detected by autoradiography and carried through
two subsequent rounds of purification, as described above. In vivo excision of
positive phage was performed according to the manufacturer's protocol
(Stratagene, La Jolla, CA).
Sequencing

In the first screen two positive plaque-forming units (pfus) were detected
among approximately 600,000 pfus. After in vivo excision, sequence analysis of
the two excised cDNAs (T6 (SEQ ID NO:11) and T1 (SEQ ID NO:12)) showed
that they represented 1103 bp and 912 bp C-terminal MFP1 sequence, respectively
(Figure 3). DNA sequencing was carried out using an ABI Model 377 Sequencer
(Perkin Elmer-ABI, Foster City, CA). Sequencing reactions utilized fluorescent
sequencing techniques with d-rhodamine and Big Dye terminator chemistry
(Perkin Elmer-ABI, Foster City, CA) and were performed according to the
standard protocols. The sequence identity between the two 3' fragments T1 (SEQ
ID NO:12) and T6 (SEQ ID NO:11) is 91.5%, suggesting the presence of two
MFP1 genes in Nicotiana tabacum.

In a second round, the tobacco cDNA library was screened with a 1.0 kb 5'
fragment of the LeMFP1 cDNA (p1-3; SEQ ID NO:10) (Meier et al., Plant Cell
8:2105-2115 (1996)). Two positive pfus were detected among approximately
600,000 pfus. Sequencing of the excised cDNAs (T2 (SEQ ID NO:13) and T3
(SEQ ID NO:14)) showed that they represented partial cDNAs, overlapping with
T1 (SEQ ID NO:12) and T6 (SEQ ID NO:11) (Figure 3). Initial sequence analysis
of the T2 (SEQ ID NO:13) and T6 (SEQ ID NO:11) cDNAs showed that they
shared 445 bp of identical overlapping sequence. It was concluded that the T2
(SEQ ID NO:13) and T6 (SEQ ID NO:11) cDNAs represent different portions of the
same gene. The overlap of T3 (SEQ ID NO:14) and T1 (SEQ ID NO:12) is only
70 bp, and within this area, there is only a single base pair difference between T6
(SEQ ID NO:11) and T1 (SEQ ID NO:12).

In order to confirm that T3 (SEQ ID NO:14) and T1 (SEQ ID NO:12)
were derived from the same gene, PCR primers (SEQ ID NO:25 and SEQ ID
NO:26) were designed from the T3 (SEQ ID NO:14) and T1 (SEQ ID NO:12)
sequences that would allow the amplification of a 397 bp fragment, overlapping
both cDNAs, from a Nicotiana tabacum lambda ZAP cDNA library. PCR
reactions were carried out in a Perkin Elmer 9600 thermocycler (Perkin Elmer,
Foster City, CA). The thermocycler was programmed as follows: a 2 min, 96 °C
denaturation step was followed by 30 cycles of 94 °C, 45 sec; 55 °C, 45 sec;
72 °C, 90 sec, and ended with an 8 min, 72 °C final extension step.
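
The thermocycler program described above can be written down as data, which makes the total programmed hold time easy to check. The Python sketch below uses only the times stated in the text; ramp times between temperatures depend on the instrument and are not included, and the function name is illustrative.

```python
from typing import Sequence


def total_hold_seconds(initial_s: int, cycles: int,
                       cycle_steps_s: Sequence[int], final_s: int) -> int:
    """Sum of all programmed hold times, excluding temperature ramping."""
    return initial_s + cycles * sum(cycle_steps_s) + final_s


# Program from the text: 2 min at 96 C; 30 cycles of 45 s at 94 C,
# 45 s at 55 C and 90 s at 72 C; 8 min final extension at 72 C.
hold_s = total_hold_seconds(initial_s=2 * 60,
                            cycles=30,
                            cycle_steps_s=(45, 45, 90),
                            final_s=8 * 60)
print(hold_s, "seconds =", hold_s / 60, "minutes")
```

Under these settings the programmed holds add up to 6000 seconds, that is 100 minutes, before ramping is taken into account.
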
Cloning

Using restriction sites added to the primers, the PCR fragment was
subsequently cloned into the XbaI/BamHI sites of pSK+ (Stratagene, La Jolla,
CA). The sequence of the fragment PCR1 (SEQ ID NO:15) (Figure 3) was found
to be 100% identical with both T1 (SEQ ID NO:12) and T3 (SEQ ID NO:14),
confirming that these two cDNA fragments are derived from the same gene.

Figure 3 shows a schematic structure of the partial tobacco MFP1 cDNAs.
T3, T1 and PCR1, shown as open boxes, represent overlapping fragments of the
same gene (NtMFP1-1). T2 and T6, shown as filled boxes, represent overlapping
fragments of a second gene (NtMFP1-2). The fragment used as a probe for the
Southern blot (Figure 5) is indicated.

Confirmation Of The Presence Of Two Genes

The divergence between the two tobacco MFP1 cDNAs indicated that they
were derived from two different genes. It has been previously shown that a single
gene (SEQ ID NO:16) codes for MFP1 (SEQ ID NO:17) in tomato (LeMFP1)
(Meier et al., Plant Cell 8:2105-2115 (1996)). Applicants have additionally found
that AtMFP1 is a single gene in Arabidopsis (data not shown). Based on these
findings it was necessary to confirm whether MFP1 was also a single-copy gene in
the two diploid progenitors of amphidiploid Nicotiana tabacum,
N. tomentosiformis and N. sylvestris. In order to test this hypothesis the
following procedure was applied.

For a Southern blot of Nicotiana tabacum genomic DNA, 20 µg aliquots of
DNA were digested with various restriction enzymes, run out on a 0.8% agarose gel,
and transferred to Immobilon N hydrophobic filters (Millipore, Bedford,
MA). Hybridization conditions were essentially as described by Maniatis. The
probe (SEQ ID NO:18) was prepared by purification of a 391 bp XhoI/SpeI
fragment from the Nicotiana tabacum clone T3, as described above. The
hybridization temperature was 65 °C. The probe (SEQ ID NO:18), shown in
Figure 3, was labelled with 32P by the random prime method according to the
manufacturer's instructions (BRL, Gaithersburg, MD). Washes were performed at
high stringency (0.1 x SSC, 0.1% SDS at 65 °C).

In the region overlapping the probe, NtMFP1-1 (SEQ ID NO:1) contains a
single XbaI site, whereas NtMFP1-2 (SEQ ID NO:3) contains no XbaI site.
Neither of the two cDNAs contains an EcoRI site. Figure 5 shows the genomic
organization of tobacco MFP1. Tobacco genomic DNA was digested with the
indicated restriction enzymes, separated by agarose gel electrophoresis and
hybridized in a genomic Southern blot with the 391 bp XhoI/SpeI fragment from the
Nicotiana tabacum cDNA clone T3 (shown in Figure 3). Abbreviations used are as
follows: E, EcoRI; X, XbaI; E/X, EcoRI/XbaI; S, SspI; S/X, SspI/XbaI. The
position of DNA size markers is indicated on the right. Two fragments were
detected in the lane containing an EcoRI digest (approximately 3.7 kb and 2.7 kb)
and three were seen in the lane containing an XbaI digest (approximately 8.0 kb,
7.5 kb and 5.0 kb). In the lane containing the EcoRI/XbaI double digest, the 3.7 kb
EcoRI fragment appears to be cleaved by XbaI, leading to two smaller fragments of
approximately 1.6 and 0.8 kb. This pattern is consistent with the presence of two
genes, one of which contains an XbaI site in the region hybridizing to the probe. In
addition, SspI and SspI/XbaI digests were analyzed. Again, one of the two bands
detected in the SspI digest is cleaved in the SspI/XbaI double digest. The observed
patterns are all consistent with the presence of two genes in the Nicotiana tabacum
genome, represented by the two isolated cDNAs. These data indicate that, at least
in tomato, tobacco and Arabidopsis (data not shown), MFP1 is encoded by a
single gene per diploid genome.

In summary, two distinct NtMFP1 cDNAs were isolated from tobacco and
named NtMFP1-1 (SEQ ID NO:1) (containing T3 and T1) and NtMFP1-2 (SEQ
ID NO:3) (containing T2 and T6). NtMFP1-1 (SEQ ID NO:1) is a full-length
cDNA coding for a protein of 722 amino acids (SEQ ID NO:2). NtMFP1-1 (SEQ
ID NO:1) and NtMFP1-2 (SEQ ID NO:3) have 77.0% and 78.9% identity to
LeMFP1 (SEQ ID NO:16) on the DNA level, respectively. The identity between the
two tobacco sequences is 91.5%. NtMFP1-1 (SEQ ID NO:2) contains an open
reading frame of 721 amino acids. It contains a short 69 bp 5' non-coding region
preceding the ATG start codon. NtMFP1-2 (SEQ ID NO:4) contains an open
reading frame of 398 amino acids and is not a full-length cDNA.
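
The restriction-site reasoning used to interpret the Southern blot above (a gene copy with an XbaI site in the probed region is cut into two hybridizing fragments, whereas a copy without the site is not) can be sketched in Python as follows. The toy sequences are hypothetical placeholders, not the tobacco sequences; only the XbaI recognition sequence (TCTAGA, cut after the first T) is taken as known.

```python
from typing import List


def digest_fragments(sequence: str, site: str, cut_offset: int) -> List[int]:
    """Lengths of the fragments produced by cutting a linear sequence at
    every occurrence of a recognition site.

    cut_offset is the number of bases of the site that stay on the
    left-hand fragment (1 for XbaI, which cuts T^CTAGA).
    """
    seq = sequence.upper()
    motif = site.upper()
    cut_positions = []
    start = seq.find(motif)
    while start != -1:
        cut_positions.append(start + cut_offset)
        start = seq.find(motif, start + 1)
    bounds = [0] + cut_positions + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


XBAI_SITE = "TCTAGA"

# Hypothetical toy sequences standing in for the two gene copies.
copy_with_site = "A" * 10 + XBAI_SITE + "C" * 10
copy_without_site = "A" * 10 + "GGGGGG" + "C" * 10

print(digest_fragments(copy_with_site, XBAI_SITE, 1))
print(digest_fragments(copy_without_site, XBAI_SITE, 1))
```
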

Table 1 lists the DNASTAR and BLAST comparison of Nt-MFP1-1 and
Nt-MFP1-2 with a suite of public databases as well as the literature sequence for
tomato (SEQ ID NO:16 and 17) and the MFP1 sequence isolated from
Arabidopsis (http://genome-www.stanford.edu/Arabidopsis/).

EXAMPLE 5

Primary And Secondary Structure Analysis Of NtMFP1-1 And NtMFP1-2

Due to the small number of MFP1-like proteins discovered to date, it was
advisable to confirm the identity of the present proteins through an analysis of
secondary protein structure. Comparisons of the Nt-MFP1-1 and Nt-MFP1-2
proteins were made with the secondary structure of LeMFP1 isolated from tomato
and AtMFP1 isolated from Arabidopsis.

The Arabidopsis genomic DNA sequence was accessed through the
Arabidopsis thaliana Database (http://genome-www.stanford.edu/Arabidopsis/).
The deduced protein sequences of the MFP1 proteins were determined and
compared using DNASTAR Lasergene software (DNASTAR, Inc., Madison, WI).
Figure 4A shows the percent identical amino acids in pairwise comparisons of the
four MFP1 proteins.

Based on the amino acid sequence identity, NtMFP1-1 (SEQ ID NO:2) and
NtMFP1-2 (SEQ ID NO:4) are most closely related. LeMFP1 (SEQ ID NO:17) is
more closely related to the two tobacco MFP1s (NtMFP1-1 (SEQ ID NO:2) and
NtMFP1-2 (SEQ ID NO:4)) (76% overall sequence identity) than to AtMFP1
(41% overall identity), reflecting the closer relationship of the two solanaceous
species.
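
For orientation, a percent identity value of the kind reported above can be computed from a pairwise alignment as sketched below in Python. This is a simplified illustration: it counts identical residues over aligned, ungapped columns, which is one common convention and may differ from the calculation performed by the DNASTAR software. The aligned strings are hypothetical placeholders, not the MFP1 sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical residues over columns where neither sequence
    has a gap. Both inputs must come from the same alignment and therefore
    have equal length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical aligned peptide fragments used only for illustration.
print(percent_identity("MKT-LLA", "MRTALLA"))
```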

Figure 4B shows the hydrophilicity and secondary structure analysis of
NtMFP1-1 (SEQ ID NO:2), LeMFP1 (SEQ ID NO:17) and AtMFP1. The
secondary structures of the proteins, hydrophilicity, α-helical, and coiled-coil
regions were analyzed using DNASTAR PROTEAN software. AH indicates
α-helical, CC indicates coiled-coil and HP indicates hydrophilicity plot. The
hydrophobic domains are marked with open boxes. Like LeMFP1 (SEQ ID
NO:17), NtMFP1 and AtMFP1 contain an extended α-helical, coiled-coil like
domain and a shorter N-terminal, non-α-helical region that contains two
hydrophobic domains. These structural features are extremely well conserved,
despite a relatively low degree of identity at the amino acid level in some areas.
This is consistent with the largely structural conservation of the positioning of
polar and non-polar amino acids that is known from other filament-like
coiled-coil proteins such as the nuclear lamins (McKeon et al., Nature
319:463-468 (1986)). The distance between the first and second hydrophobic
domains is very similar in all three proteins (29 AA for tomato, 31 AA for
tobacco, and 33 AA for Arabidopsis MFP1), indicating a functional relevance of
the spacing between the two hydrophobic domains. The length of the N-terminal
domain preceding the first hydrophobic domain varies: 56 AA for tomato, 61 AA
for tobacco, and 72 AA for Arabidopsis MFP1. The common feature of this domain
in all three proteins is a relatively high content of serine and threonine
residues (27% to 28%).
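
A serine/threonine content of the kind reported above can be computed from a list of three-letter residue codes, as in the following minimal sketch. The function name `ser_thr_fraction` is hypothetical, and the example input is only the first sixteen residues of SEQ ID NO:2, not the full N-terminal domain, so its value is not expected to reproduce the 27% to 28% figure.

```python
def ser_thr_fraction(residues):
    """Fraction of residues that are Ser or Thr (three-letter codes)."""
    residues = [r for r in residues if r]  # drop empty strings
    if not residues:
        return 0.0
    hits = sum(1 for r in residues if r in ("Ser", "Thr"))
    return hits / len(residues)


# First sixteen residues of SEQ ID NO:2, used purely as sample input.
first_16 = "Met Gly Ser Ser Cys Phe Pro Gln Ser Pro Leu Ser His Ser Leu Phe".split()
print(ser_thr_fraction(first_16))
```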

EXAMPLE 6 
Composition Of cDNA Libraries And Identification 
Of cDNA Clones From Other Plant Species Encoding Homologs Of MFP1

cDNA libraries representing mRNAs from soybean, corn, or rice tissues were
prepared. The characteristics of the libraries are described below in Table 2.



Table 2
cDNA Libraries from Plants

Plant                                 Library            Tissue
Soybean (Glycine max)                 src3c.pk004.m1     8 day old root tissue inoculated with eggs of nematode
Corn (Zea mays)                       p0118.chsab48r     stem tissue, night harvested
Rice (Oryza sativa L., Nipponbare)    rca1n.pk022.a11    callus normalized



Soybean MFP1:

A soybean MFP1 cDNA was identified based on primary and secondary
structure analysis. This sequence, from clone src3c.pk004.m1, came from a
library prepared from 8 day old root tissue inoculated with eggs of cyst nematode
for four days. This sequence contains 1164 base pairs of DNA (SEQ ID NO:19)
encoding 388 amino acids (SEQ ID NO:20).
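
As a quick arithmetic check on the figures above, a sequence length can be divided into complete codons and leftover bases. The helper `codon_summary` below is a hypothetical illustration, not part of any software cited in this application.

```python
def codon_summary(length_bp):
    """Return (complete_codons, leftover_bases) for a sequence length in bp."""
    return divmod(length_bp, 3)


codons, leftover = codon_summary(1164)
print(codons, leftover)
```

For 1164 base pairs this gives 388 complete codons with no bases left over, which agrees with the stated 388 amino acids and indicates that the count does not include a stop codon.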

Comparison of this partial soybean MFP1 sequence (SEQ ID NO:20) with
the sequences from tomato (LeMFP1; SEQ ID NO:17), tobacco (NtMFP1-1; SEQ
ID NO:2), and Arabidopsis (AtMFP1) shows it to be 46.1, 45.9, and 40.7%
identical to these sequences, respectively. In addition, secondary structure
analysis of the partial soybean MFP1 (GmMFP1; SEQ ID NO:20) coded by SEQ
ID NO:19 shows that it contains an extended α-helical, coiled-coil like domain as
do the other MFP1 protein sequences. Results of a Southern blot experiment (not
shown) suggest that the soybean MFP1 is encoded by a single copy gene.

Corn MFP1:

A corn EST sequence was identified as an MFP1 homolog from clone
p0118.chsab48r. Secondary structure analysis of the corn MFP1 protein (SEQ ID
NO:22) coded by SEQ ID NO:21 shows that it contains an extended α-helical,
coiled-coil like domain as well as one of the hydrophobic domains at the
N-terminus. Both features are indicative of MFP1 proteins (see Figure 4B).

Rice MFP1:

A rice EST, from clone rca1n.pk022.a11, was isolated which codes for an
MFP1 protein. The identity of the rice EST was based on its high degree of
identity to the corn MFP1 sequence (68%). This clone covers the C-terminal
region that is most highly conserved between all MFP1 proteins identified. The
rice MFP1 sequence (SEQ ID NO:23) codes for SEQ ID NO:24.

Table 3 lists the DNASTAR and BLAST comparison of the MFP1
sequences isolated from corn, soybean and rice with a suite of public databases as
well as the literature sequence for tomato (SEQ ID NO:16 and 17) and the
sequence isolated from Arabidopsis
(http://genome-www.stanford.edu/Arabidopsis/).

SEQUENCE LISTING



<110> E. I. du Pont de Nemours and Company

<120> Homologs of MAR-binding Filament-like protein 1 (MFP1) 

<130> BC1003 PCT 

<140> 
<141> 

<150> 60/128,900 
<151> 1999-04-12 

<160> 26 

<170> Microsoft Office 97 

<210> 1 
<211> 2168 
<212> DNA 

<213> Nicotiana tabacum 
<400> 1 

atggggagtt cttgttttcc ccaatctcca ctctctcatt ctctcttttc ttcttcatca 60 
atatcttctt cccaatttac acccttgctt ttttccccaa gaaatgcgca aaaatgtaaa 120 
aagaaaatgc cagctatggc atgtatacac tcggagaatc aaaaggaaag cgaattctgc 180 
agcagaagaa cgattctttt cgtgggtttc tctgttcttc cacttctcag cttgagggca 240 
aatgcttttg aaggcttgtc agtagattct caagtaaaag cacagccgca gaaagaggag 300 
acagagcaaa caatccaagg aaatgcagag aatcccttct tttctctact taatggactt 360 
ggagtttttg gttcaggcgt gcttggttct ctttatgcct tggctcgaaa cgagaaggcc 420 
gtttctgatg caaccattga atctatgaaa aataagctga aggagaaaga agccacattc 480
gtttcatgga gaagaaattc cagtctgagc tgctgaacga aagggatata cgaaataatc 540 
aacttaagag ggcaggcgaa gaacggcaag ctctggttaa ccaattgaat tcagcaaaga 600 
gtacagtaac taaccttggt caggagctgc aaaaagaaaa acgaattgct gaagagctca 660 
tagttcagat cgagggcctt caaaataacc tcatgcagat gaaggaggat aagaaaaaat 720 
tgcaggagga gcttaaagag aagcttgatt tgatacaagt tctgcaagaa aagataactt 780 
tacttactac agagatcaaa gataaagagg catctcttca gagtacaacc tctaaactag 840 
ctgaaaaaga atcagaggta gataaattga gctcaatgta tcaggaatcc caggatcagc 900 
tgatgaattt gacttcagaa atcaaagaac ttaaagtcga agtccagaaa agagagagag 960 
aactagagtt gaaacgtgaa tcagaagaca accttaatgt gcgattaaat tctttgctcg 1020 
ttgagagaga tgaatctaaa aaagagcttg atgctattca aaaggaatac agcgagttca 1080 
agtccatttc agagaagaaa gtggcttctg atgccaagct gttgggggaa caagaaaaga 1140 
gactacacca gctcgaggaa caacttggca ctgcctcaga tgaagtacgc aaaaataatg 1200 
tgctaatcgc tgatctgact caagaaaaag aaaacttaag gagaatgctg gacgctgagc 1260 
tggaaaacat aagcaagttg aagctagagg tccaggttac tcaggaaact cttgagaaat 1320 
ctagaagtga tgcttctgat atagcacaac aactacagca gtcgaggcat ctttgctcta 1380 
agcttgaagc tgaggtttct aaacttcaga tggaattgga ggaaacaaga acatcattac 1440
ggaggaacat tgatgagaca aaacgtggtg cagagctctt agctgcggag ctgaccacta 1500 
ctagggagct tctaaagaaa acaaatgaag aaatgcacac tatgtctcat gaactagcgg 1560 
ctgttactga aaattgtgat aacttacaga cggagctagt tgatgtctac aagaaagcag 1620 
aacgtgctgc tgatgaactg aaacaagaaa agaatattgt cgtgacactg gagaaagagc 1680 
taacattttt ggaggctcaa attacaagag agaaagagtc acggaagaat ctggaagaag 1740 
agctggaaag ggctacggaa tcacttgatg agatgaaccg aaatgctttt gcacttgcaa 1800 
aggagcttga gcttgctaat tctcatattt ctagcctcga ggatgagaga gaagtgctcc 1860 
aaaagtctgt ttctgagcag aaacaaattt ctcaagaatc ccgagaaaac cttgaagatg 1920 
cccatagcct ggtaatgaaa cttggcaagg aacgcgagag tctggagaag agagcaaaga 1980 
aattggaaga tgaaatggca tcagcaaaag gtgagatttt gcggctgcgg acccaagtaa 2040 
attcggtaaa agctcctgtt aacaatgagg aaaaagttga agctggggaa aaggcagctg 2100 
taacagtgaa gagaaccagg aggaggaaga ctgctactca gcctgcttct cagcaagaaa 2160 
gctcatag 2168 

<210> 2 
<211> 721
WO 00/61615 PCT/US00/09723 

<212> PRT 

<213> Nicotiana tabacum 
<400> 2 

Met Gly Ser Ser Cys Phe Pro Gln Ser Pro Leu Ser His Ser Leu Phe
1 5 10 15

Ser Ser Ser Ser Ile Ser Ser Ser Gln Phe Thr Pro Leu Leu Phe Ser
20 25 30

Pro Arg Asn Ala Gln Lys Cys Lys Lys Lys Met Pro Ala Met Ala Cys
35 40 45

Ile His Ser Glu Asn Gln Lys Glu Ser Glu Phe Cys Ser Arg Arg Thr
50 55 60

Ile Leu Phe Val Gly Phe Ser Val Leu Pro Leu Leu Ser Leu Arg Ala
65 70 75 80

Asn Ala Phe Glu Gly Leu Ser Val Asp Ser Gln Val Lys Ala Gln Pro
85 90 95

Gln Lys Glu Glu Thr Glu Gln Thr Ile Gln Gly Asn Ala Glu Asn Pro
100 105 110

Phe Phe Ser Leu Leu Asn Gly Leu Gly Val Phe Gly Ser Gly Val Leu
115 120 125

Gly Ser Leu Tyr Ala Leu Ala Arg Asn Glu Lys Ala Val Ser Asp Ala
130 135 140

Thr Ile Glu Ser Met Lys Asn Lys Leu Lys Glu Lys Glu Ala Thr Phe
145 150 155 160

Val Ser Met Glu Lys Lys Phe Gln Ser Glu Leu Leu Asn Glu Arg Asp
165 170 175

Ile Arg Asn Asn Gln Leu Lys Arg Ala Gly Glu Glu Arg Gln Ala Leu
180 185 190

Val Asn Gln Leu Asn Ser Ala Lys Ser Thr Val Thr Asn Leu Gly Gln
195 200 205

Glu Leu Gln Lys Glu Lys Arg Ile Ala Glu Glu Leu Ile Val Gln Ile
210 215 220

Glu Gly Leu Gln Asn Asn Leu Met Gln Met Lys Glu Asp Lys Lys Lys
225 230 235 240

Leu Gln Glu Glu Leu Lys Glu Lys Leu Asp Leu Ile Gln Val Leu Gln
245 250 255

Glu Lys Ile Thr Leu Leu Thr Thr Glu Ile Lys Asp Lys Glu Ala Ser
260 265 270

Leu Gln Ser Thr Thr Ser Lys Leu Ala Glu Lys Glu Ser Glu Val Asp
275 280 285

Lys Leu Ser Ser Met Tyr Gln Glu Ser Gln Asp Gln Leu Met Asn Leu
290 295 300

Thr Ser Glu Ile Lys Glu Leu Lys Val Glu Val Gln Lys Arg Glu Arg
305 310 315 320

Glu Leu Glu Leu Lys Arg Glu Ser Glu Asp Asn Leu Asn Val Arg Leu
325 330 335

Asn Ser Leu Leu Val Glu Arg Asp Glu Ser Lys Lys Glu Leu Asp Ala
340 345 350

Ile Gln Lys Glu Tyr Ser Glu Phe Lys Ser Ile Ser Glu Lys Lys Val
355 360 365

Ala Ser Asp Ala Lys Leu Leu Gly Glu Gln Glu Lys Arg Leu His Gln
370 375 380

Leu Glu Glu Gln Leu Gly Thr Ala Ser Asp Glu Val Arg Lys Asn Asn
385 390 395 400

Val Leu Ile Ala Asp Leu Thr Gln Glu Lys Glu Asn Leu Arg Arg Met
405 410 415

Leu Asp Ala Glu Leu Glu Asn Ile Ser Lys Leu Lys Leu Glu Val Gln
420 425 430

Val Thr Gln Glu Thr Leu Glu Lys Ser Arg Ser Asp Ala Ser Asp Ile
435 440 445

Ala Gln Gln Leu Gln Gln Ser Arg His Leu Cys Ser Lys Leu Glu Ala
450 455 460

Glu Val Ser Lys Leu Gln Met Glu Leu Glu Glu Thr Arg Thr Ser Leu
465 470 475 480

Arg Arg Asn Ile Asp Glu Thr Lys Arg Gly Ala Glu Leu Leu Ala Ala
485 490 495

Glu Leu Thr Thr Thr Arg Glu Leu Leu Lys Lys Thr Asn Glu Glu Met
500 505 510

His Thr Met Ser His Glu Leu Ala Ala Val Thr Glu Asn Cys Asp Asn
515 520 525

Leu Gln Thr Glu Leu Val Asp Val Tyr Lys Lys Ala Glu Arg Ala Ala
530 535 540

Asp Glu Leu Lys Gln Glu Lys Asn Ile Val Val Thr Leu Glu Lys Glu
545 550 555 560

Leu Thr Phe Leu Glu Ala Gln Ile Thr Arg Glu Lys Glu Ser Arg Lys
565 570 575

Asn Leu Glu Glu Glu Leu Glu Arg Ala Thr Glu Ser Leu Asp Glu Met
580 585 590

Asn Arg Asn Ala Phe Ala Leu Ala Lys Glu Leu Glu Leu Ala Asn Ser
595 600 605

His Ile Ser Ser Leu Glu Asp Glu Arg Glu Val Leu Gln Lys Ser Val
610 615 620

Ser Glu Gln Lys Gln Ile Ser Gln Glu Ser Arg Glu Asn Leu Glu Asp
625 630 635 640

Ala His Ser Leu Val Met Lys Leu Gly Lys Glu Arg Glu Ser Leu Glu
645 650 655

Lys Arg Ala Lys Lys Leu Glu Asp Glu Met Ala Ser Ala Lys Gly Glu
660 665 670

Leu Arg Leu Arg Thr Gln Val Asn Ser Val Lys Ala Pro Val Asn Asn
675 680 685

Glu Glu Lys Val Glu Ala Gly Glu Lys Ala Ala Val Thr Val Lys Arg
690 695 700

Thr Arg Arg Arg Lys Thr Ala Thr Gln Pro Ala Ser Gln Gln Glu Ser
705 710 715 720

Ser



<210> 3 
<211> 1199 
<212> DNA 

<213> Nicotiana tabacum 
<400> 3 

cgagatgtga atcagaagac aacctgaatg tgcaattaaa ttctttgctc gttgagagag 60 
atgaatctaa aaaagagctt gatgctattc aaaaggaata cagcgagttc aagtccattt 120 
cagagaagag agtggcttca gatgccaagc tgttggggga acaagaaaag agactacacc 180 
agctcgagga acaacttggt actgccgtaa gtgaagtaag aaaaaataaa gtgctaattg 240 
ctaatttgac tcaagcaaaa gaaaacctaa ggagaatgct ggacgctgag ctggaaaatg 300 
taagcaagtt gaagctagag gtccaggtta ctcaggaaac tcttgagaaa tcaagaagtg 360 
aagcttctga tatagtagaa caactacagc agtcgaggca cttgtgctct aagcttgaag 420 
ctgaggtttc taagcttcag atggaattgg aggaaacaag gacattgtta cagaagaaca 480 
ttgatgagac aaaacgtggt gcagagttct tagctgcgga gctgaccact actagggagc 540 
ttctaaagaa aacaaatgaa gaaatgcaca ccatatccaa tgaactagct gctgttactg 600 
aaaatcgtga taacttacag acggagctag ttgatgtcta caagaaagca gaacgtgctg 660 
ttaatgaact gaaacaagaa aagaatattg tcgtgacatt ggagaaagag ctaacatttt 720 
tggaggctca aattacaaga gagaaagagt cacggaagaa tctggaagaa gagttggaaa 780 
gggctacaga atcacttgat gagatgaaca gaaatgcttt tgcacttgca aaggagctgg 840 
agctcgctaa ttctcgtatt tctagcctca aagacgagag agaagtgctc caaaagtctg 900 
tttctgagca gaagcaaatt tctcaagaag cccgagaaaa ccttgaagat gcccatagcc 960 
tggtgatgaa acttggcaag gaacgcgaga gtctggagaa gagagcaaag aaattggaag 1020 
atgaaatggc atcagcaaaa ggtgagattt tgcggttgcg gacacaagta aattcggtaa 1080 
aagctcctgt taacaaagag gaaaaagttg aagctgggga aaaggcaaca gtaacagtga 1140 
agagaacaac caggaggagg aagactgcta ctcctgcttc tcaacaagaa ggctcataa 1199 

<210> 4 
<211> 398 
<212> PRT 

<213> Nicotiana tabacum 
<400> 4 

Arg Cys Glu Ser Glu Asp Asn Leu Asn Val Gln Leu Asn Ser Leu Leu
1 5 10 15

Val Glu Arg Asp Glu Ser Lys Lys Glu Leu Asp Ala Ile Gln Lys Glu
20 25 30

Tyr Ser Glu Phe Lys Ser Ile Ser Glu Lys Arg Val Ala Ser Asp Ala
35 40 45

Lys Leu Leu Gly Glu Gln Glu Lys Arg Leu His Gln Leu Glu Glu Gln
50 55 60

Leu Gly Thr Ala Val Ser Glu Val Arg Lys Asn Lys Val Leu Ile Ala
65 70 75 80

Asn Leu Thr Gln Ala Lys Glu Asn Leu Arg Arg Met Leu Asp Ala Glu
85 90 95

Leu Glu Asn Val Ser Lys Leu Lys Leu Glu Val Gln Val Thr Gln Glu
100 105 110

Thr Leu Glu Lys Ser Arg Ser Glu Ala Ser Asp Ile Val Glu Gln Leu
115 120 125

Gln Gln Ser Arg His Leu Cys Ser Lys Leu Glu Ala Glu Val Ser Lys
130 135 140

Leu Gln Met Glu Leu Glu Glu Thr Arg Thr Leu Leu Gln Lys Asn Ile
145 150 155 160

Asp Glu Thr Lys Arg Gly Ala Glu Leu Leu Ala Ala Glu Leu Thr Thr
165 170 175

Thr Arg Glu Leu Leu Lys Lys Thr Asn Glu Glu Met His Thr Ile Ser
180 185 190

Asn Glu Leu Ala Ala Val Thr Glu Asn Arg Asp Asn Leu Gln Thr Glu
195 200 205

Leu Val Asp Val Tyr Lys Lys Ala Glu Arg Ala Val Asn Glu Leu Lys
210 215 220

Gln Glu Lys Asn Ile Val Val Thr Leu Glu Lys Glu Leu Thr Phe Leu
225 230 235 240

Glu Ala Gln Ile Thr Arg Glu Lys Glu Ser Pro Lys Asn Leu Glu Glu
245 250 255

Glu Leu Glu Arg Ala Thr Glu Ser Leu Asp Glu Met Asn Arg Asn Ala
260 265 270

Phe Ala Leu Ala Lys Glu Leu Glu Leu Ala Asn Ser Arg Ile Ser Ser
275 280 285

Leu Lys Asp Glu Arg Glu Val Leu Gln Lys Ser Val Ser Glu Gln Lys
290 295 300

Gln Ile Ser Gln Glu Ala Arg Glu Asn Leu Glu Asp Ala His Ser Leu
305 310 315 320

Val Met Lys Leu Gly Lys Glu Arg Glu Ser Leu Glu Lys Arg Ala Lys
325 330 335

Lys Leu Glu Asp Glu Met Ala Ser Ala Lys Gly Glu Ile Leu Arg Leu
340 345 350

Arg Thr Gln Val Asn Ser Val Lys Ala Pro Val Asn Lys Glu Glu Lys
355 360 365

Val Glu Ala Gly Glu Lys Ala Thr Val Thr Val Lys Arg Thr Thr Arg
370 375 380

Arg Arg Lys Thr Ala Thr Pro Ala Ser Gln Gln Glu Gly Ser
385 390 395

<210> 5 

<211> 588 

<212> DNA 

<213> Lycopersicon esculentum 
<400> 5

agagcttaaa gagaagcttg atttgattca agttcttgaa gaaaagatta ctttgcttac 60 
tacagagatc aaagataaag aggtgagtct tcggagtaac acctctaaac tagctgaaaa 120 
agaatcggag gtaaatagtt tgagcgatat gtatcaacaa tcccaggatc agctgatgaa 180 
tttgacttca gagatcaaag aacttaaaga tgaaatccag aaaagagaga gagaactgga 240 
gttgaaatgt gtatcagaag acaacctgaa tgtgcaatta aattctttgc tcctcgagag 300 
agatgaatct aaaaaagagc ttcatgctat tcaaaaggaa tacagtgagt tcaagtccaa 360 
ttctgatgag aaggtggctt cagatgcgaa gctgttgggg gaacaagaga agagactaca 420 
ccagcttgag gaacaacttg gcactgcctt aagtgaagca agtaaaaatg aagtgctaat 480 
tgctgatctg actcgagaaa aagaaaacct taggagaatg gtggatgctg agctggacaa 540 
tgtaaacaag ttaaagcaag agattgaagt cactcaggaa agtcttga 588

<210> 6 

<211> 195 

<212> PRT 

<213> Lycopersicon esculentum 

<400> 6 

Glu Leu Lys Glu Lys Leu Asp Leu Ile Gln Val Leu Glu Glu Lys Ile
1 5 10 15

Thr Leu Leu Thr Thr Glu Ile Lys Asp Lys Glu Val Ser Leu Arg Ser
20 25 30

Asn Thr Ser Lys Leu Ala Glu Lys Glu Ser Glu Val Asn Ser Leu Ser
35 40 45

Asp Met Tyr Gln Gln Ser Gln Asp Gln Leu Met Asn Leu Thr Ser Glu
50 55 60

Ile Lys Glu Leu Lys Asp Glu Ile Gln Lys Arg Glu Arg Glu Leu Glu
65 70 75 80

Leu Lys Cys Val Ser Glu Asp Asn Leu Asn Val Gln Leu Asn Ser Leu
85 90 95

Leu Leu Glu Arg Asp Glu Ser Lys Lys Glu Leu His Ala Ile Gln Lys
100 105 110

Glu Tyr Ser Glu Phe Lys Ser Asn Ser Asp Glu Lys Val Ala Ser Asp
115 120 125

Ala Lys Leu Leu Gly Glu Gln Glu Lys Arg Leu His Gln Leu Glu Glu
130 135 140

Gln Leu Gly Thr Ala Leu Ser Glu Ala Ser Lys Asn Glu Val Leu Ile
145 150 155 160

Ala Asp Leu Thr Arg Glu Lys Glu Asn Leu Arg Arg Met Val Asp Ala
165 170 175

Glu Leu Asp Asn Val Asn Lys Leu Lys Gln Glu Ile Glu Val Thr Gln
180 185 190

Glu Ser Leu
195



<210> 7 

<211> 662 

<212> DNA 

<213> Lycopersicon esculentum 
<400> 7

gaccactact aaggagcttc taaagaaaac aaatgaagaa atgcacacta tgtcagatga 60
accgtgatag cttacagaca gagctagttg atgtctataa gaaagcagaa catactgcta 120
atgaactgaa acaagaaaag agcattgttg caacactaga agaagagtta aaatttctgg 180
agtctcaaat tacacgagag aaagagttac ggaagagtct ggaagacgag ttagaaaagg 240
ctacagaatc tcttgatgag attaaccgaa atgtgttggc acttgcagag gagctggagc 300
ttgctacttc tcgtaattct agcctcgaag acgagagaga agtgctccga cagtctgttt 360
ctgagcagaa gcaaatttca caagaagccc aagaaaatct ggaagacgcc catagcctgg 420
tgatgaaact tggcaaggaa cgcgaaagtc ttgagaagag agcaaagaaa ttggaagatg 480
aaatggcagc agcaaaaggt gagattttgc ggctacggag ccaaataaac tcagtaaaag 540
ctccagtgga ggatgaggaa aaagttgttg ctggggaaaa ggaaaaggtg aaggcaacag 600
taacagcaaa gaaaactacc aggagaagga agagtgctac tgttaagcaa gaggaaccct 660
ag 662

<210> 8 
<211> 226 
<212> PRT 

<213> Lycopersicon esculentum 
<400> 8 

Thr Thr Thr Lys Glu Leu Leu Lys Lys Thr Asn Glu Glu Met His Thr
1 5 10 15

Met Ser Asp Glu Leu Val Ala Val Ser Glu Asn Arg Asp Ser Leu Gln
20 25 30

Thr Glu Leu Val Asp Val Tyr Lys Lys Ala Glu His Thr Ala Asn Glu
35 40 45

Leu Lys Gln Glu Lys Ser Ile Val Ala Thr Leu Glu Glu Glu Leu Lys
50 55 60

Phe Leu Glu Ser Gln Ile Thr Arg Glu Lys Glu Leu Arg Lys Ser Leu
65 70 75 80

Glu Asp Glu Leu Glu Lys Ala Thr Glu Ser Leu Asp Glu Ile Asn Arg
85 90 95

Asn Val Leu Ala Leu Ala Glu Glu Leu Glu Leu Ala Thr Ser Arg Asn
100 105 110

Ser Ser Leu Glu Asp Glu Arg Glu Val Leu Arg Gln Ser Val Ser Glu
115 120 125

Gln Lys Gln Ile Ser Gln Glu Ala Gln Glu Asn Leu Glu Asp Ala His
130 135 140

Ser Leu Val Met Lys Leu Gly Lys Glu Arg Glu Ser Leu Glu Lys Arg
145 150 155 160

Ala Lys Lys Leu Glu Asp Glu Met Ala Ala Ala Lys Gly Glu Ile Leu
165 170 175

Arg Leu Arg Ser Gln Ile Asn Ser Val Lys Ala Pro Val Glu Asp Glu
180 185 190

Glu Lys Val Val Ala Gly Glu Lys Glu Lys Val Lys Ala Thr Val Thr
195 200 205

Ala Lys Lys Thr Thr Arg Arg Arg Lys Ser Ala Thr Val Lys Gln Glu
210 215 220

Glu Pro
225

<210> 9
<211> 1694
<212> DNA
<213> Lycopersicon esculentum
<400> 9

gaagagctta aagagaagct tgatttgatt caagttcttg aagaaaagat tactttgctt 60
actacagaga tcaaagataa agaggtgagt cttcggagta acacctctaa actagctgaa 120
aaagaatcgg aggtaaatag tttgagcgat atgtatcaac aatcccagga tcagctgatg 180
aatttgactt cagagatcaa agaacttaaa gatgaaatcc agaaaagaga gagagaactg 240
gagttgaaat gtgtatcaga agacaacctg aatgtgcaat taaattcttt gctcctcgag 300
agagatgaat ctaaaaaaga gcttcatgct attcaaaagg aatacagtga gttcaagtcc 360
aattctgatg agaaggtggc ttcagatgcg aagctgttgg gggaacaaga gaagagacta 420
caccagcttg aggaacaact tggcactgcc ttaagtgaag caagtaaaaa tgaagtgcta 480
attgctgatc tgactcgaga aaaagaaaac cttaggagaa tggtggatgc tgagctggac 540
aatgtaaaca agttaaagca agagattgaa gtcactcagg aaagtcttga gaattcaaga 600
agtgaagttt ctgatataac agtacaacta gagcagttga gggatctttg ctccaaactt 660
gaagctgagg tttctaaact tcagatggaa ttggaggaaa caagggcatc attacagagg 720
aacattgatg aaacaaaaca cagttcagag ctcttagctg ctgagttgac cactactaag 780
gagcttctaa agaaaacaaa tgaagaaatg cacactatgt cagatgaact agtagctgtt 840
tctgaaaatc gtgatagctt acagacagag ctagttgatg tctataagaa agcagaacat 900
actgctaatg aactgaaaca agaaaagagc attgttgcaa cactagaaga agagttaaaa 960
tttctggagt ctcaaattac acgagagaaa gagttacgga agagtctgga agacgagtta 1020
gaaaaggcta cagaatctct tgatgagatt aaccgaaatg tgttggcact tgcagaggag 1080
ctggagcttg ctacttctcg taattctagc ctcgaagacg agagagaagt gctccgacag 1140
tctgtttctg agcagaagca aatttcacaa gaagcccaag aaaatctgga agacgcccat 1200
agcctggtga tgaaacttgg caaggaacgc gaaagtcttg agaagagagc aaagaaattg 1260
gaagatgaaa tggcagcagc aaaaggtgag attttgcggc tacggagcca aataaactca 1320
gtaaaagctc cagtggagga tgaggaaaaa gttgttgctg gggaaaagga aaaggtgaag 1380
gcaacagtaa cagcaaagaa aactaccagg agaaggaaga gtgctactgt taagcaagag 1440
gaaccctagt tggctgtttc tgaatgacat aatcttcttc tttttttgtc ctgactcatt 1500
tgtttgcaat atttatagag aggccagaat taggacattg ccattggaac aagctgtgta 1560
ttgtctcttt gagtgtacat ttcccggcga gaagttgcag aaacaaatga ctgatctctt 1620
gatattcagt caatgttgca gcttactgaa tgaaattatt tgtattgtaa aaaaaaaaaa 1680
aaaaaaaaaa aaaa 1694



<210> 10 

<211> 1009 

<212> DNA 

<213> Lycopersicon esculentum 



<400> 10 

taataatggc aacttcttgt tttcctccat tttctgcttc atcttcttca ttatgttctt 60
cccaatttac acctttgctt tcttgcccaa gaaataccca aatatgtaga aagaagagac 120
cggttatggc gagtatgcac tcggaaaatc aaaaggaaag taatgtctgc aacagaagat 180
cgattctatt tgtgggattc tcagttcttc cacttctcaa tttgagggca agagctctcg 240
aaggcttgtc aacagattct caagcacagc cgcagaaaga ggaaaccgag caaacaatcc 300
aaggaagtgc agggaatccc ttcgtttctc tacttaatgg acttggtgtt gttggttcag 360
gcgtgcttgg ttctctttat gccttggctc gaaatgagaa ggcagtttca gatgcaacca 420
ttgaatctat gaaaaataag ctgaaggaca aggaagatgc atttgtttca atgaagaagc 480
aatttgagtc cgaattgctg agcgaaaggg aagatcgaaa taagctaatt aggcgagaag 540
gtgaagagcg gcaagctttg gttaatcagt taaaatcagc gaagactaca gtaataagcc 600
ttggtcagga gctgcaaaac gaaaaaaaac ttgctgaaga tctcaaattt gagatcaagg 660
gccttcaaaa tgacctcatg aatacgaagg aggataagaa gaaattgcag gaagagctta 720
aagagaagct tgatttgatt caagttcttg aagaaaagat tactttgctt actacagaga 780
tcaaagataa agaggtgagt cttcggagta acacctctaa actagctgaa aaagaatcgg 840
aggtaaatag tttgagcgat atgtatcaac aatcccagga tcagctgatg aatttgactt 900
cagagatcaa agaacttaaa gatgaaatcc agaaaagaga gagagaactg gagttgaaat 960
gtgtatcaga agacaacctg aatgtgcaat taaattcttt gctcctcga 1009



<210> 11 
<211> 1103 
<212> DNA

<213> Nicotiana tabacum 



<400> 11 

cttgagaaat caagaagtga agcttctgat atagtagaac aactacagca gtcgaggcat 60
ctttgctcta agcttgaagc tgaggtttct aagcttcaga tggaattgga ggaaacaagg 120
acattgttac agaagaacat tgatgagaca aaacgtggtg cagagttctt agctgcggag 180
ctgaccacta ctagggagct tctaaagaaa acaaatgaag aaatgcacac catatccaat 240
gaactagctg ctgttactga aaatcgtgat aacttacaga cggagctagt tgatgtctac 300
aagaaagcag aacgtgctgt taatgaactg aaacaagaaa agaatattgt cgtgacattg 360
gagaaagagc taacattttt ggaggctcaa attacaagag agaaagagtc acggaagaat 420
ctggaagaag agttggaaag ggctacagaa tcacttgatg agatgaacag aaatgctttt 480
gcacttgcaa aggagctgga gctcgctaat tctcgtattt ctagcctcaa agacgagaga 540
gaagtgctcc aaaagtctgt ttctgagcag aagcaaattt ctcaagaagc ccgagaaaac 600
cttgaagatg cccatagcct ggtgatgaaa cttggcaagg aacgcgagag tctggagaag 660
agagcaaaga aattggaaga tgaaatggca tcagcaaaag gtgagatttt gcggttgcgg 720
acacaagtaa attcggtaaa agctcctgtt aacaaagagg aaaaagttga agctggggaa 780
aaggcaacag taacagtgaa gagaacaacc aggaggagga agactgctac tcctgcttct 840
caacaagaag gctcataatt tgctgtttct gaagtgacat atatccttcc ttttttcctt 900
gactcatatt aattgcaacg agggtagatt attggttcat tatataaaac cagaatgagg 960
atattgcctt tgtaagaaac tttctgcaag ctgtattctc agtgagtaaa tttccaggcg 1020
agaagttgcc caaataaatg agatattatt gttgcaagta ccaaatttgg aagggattgt 1080
ttgatatcaa aaaaaaaaaa aaa 1103



<210> 12 

<211> 912 

<212> DNA 

<213> Nicotiana tabacum 



<400> 12 

atgagacaaa acgtggtgca gagctcttag ctgcggagct gaccactact agggagcttc 60
taaagaaaac aaatgaagaa atgcacacta tgtctcatga actagcggct gttactgaaa 120
attgtgataa cttacagacg gagctagttg atgtctacaa gaaagcagaa cgtgctgctg 180
atgaactgaa acaagaaaag aatattgtcg tgacactgga gaaagagcta acatttttgg 240
aggctcaaat tacaagagag aaagagtcac ggaagaatct ggaagaagag ctggaaaggg 300
ctacggaatc acttgatgag atgaaccgaa atgcttttgc acttgcaaag gagcttgagc 360
ttgctaattc tcatatttct agcctcgagg atgagagaga agtgctccaa aagtctgttt 420
ctgagcagaa acaaatttct caagaatccc gagaaaacct tgaagatgcc catagcctgg 480
taatgaaact tggcaaggaa cgcgagagtc tggagaagag agcaaagaaa ttggaagatg 540
aaatggcatc agcaaaaggt gagattttgc ggctgcggac ccaagtaaat tcggtaaaag 600
ctcctgttaa caatgaggaa aaagttgaag ctggggaaaa ggcagctgta acagtgaaga 660
gaaccaggag gaggaagact gctactcagc ctgcttctca gcaagaaagc tcatagtttg 720
ctgttctaaa gtgacatatc tttccttttt gtccttgact caaattgatt gcgacgagaa 780
tagattaatg gtgtattata gagaagccag aattaggata ttgcccttgt aagaaacttc 840
ctgcaagctg tattctcagt gagtgtatat ttccaggtga gaagttgcac aaacaaaaaa 900
aaaaaaaaaa aa 912



<210> 13 

<211> 905 

<212> DNA 

<213> Nicotiana tabacum 



<400> 13 

cgagatgtga atcagaagac aacctgaatg tgcaattaaa ttctttgctc gttgagagag 60
atgaatctaa aaaagagctt gatgctattc aaaaggaata cagcgagttc aagtccattt 120
cagagaagag agtggcttca gatgccaagc tgttggggga acaagaaaag agactacacc 180
agctcgagga acaacttggt actgccgtaa gtgaagtaag aaaaaataaa gtgctaattg 240
ctaatttgac tcaagcaaaa gaaaacctaa ggagaatgct ggacgctgag ctggaaaatg 300
taagcaagtt gaagctagag gtccaggtta ctcaggaaac tcttgagaaa tcaagaagtg 360
aagcttctga tatagtagaa caactacagc agtcgaggca tctttgctct aagcttgaag 420
ctgaggtttc taagcttcag atggaattgg aggaaacaag gacattgtta cagaagaaca 480
ttgatgagac aaaacgtggt gcagagctct tagctgcgga gctgaccact actagggagc 540
ttctaaagaa aacaaatgaa gaaatgcaca ccatatccaa tgaactagct gctgttactg 600
aaaatcgtga taacttacag acggagctag ttgatgtcta caagaaagca gaacgtgctg 660
ttaatgaact gaaacaagaa aagaatattg tcgtgacatt ggagaaagag ctaacatttt 720
tggaggctca aattacaaga gagaaagagt caccgaagaa tctggaagaa gagttggaaa 780
gggctagctc gcttaagtac aggagatgga gaatccaccg aagaatgaag tagtggcaga 840
tcatctgcgt ccaagcaagt tacttcacca acagaaaact tggatttgta cctgcctgct 900
ctccg 905



<210> 14 

<211> 1597 

<212> DNA 

<213> Nicotiana tabacum 



<400> 14 
cggcctctga aatcttcttc tttttatcac tttcggagtg gaaatcggga gaaaccaacc 60
aactttgtaa tggggagttc ttgttttccc caatctccac tctctcattc tctcttttct 120
tcttcatcaa tatcttcttc ccaatttaca cccttgcttt tttccccaag aaatgcgcaa 180
aaatgtaaaa agaaaatgcc agctatggca tgtatacact cggagaatca aaaggaaagc 240
gaattctgca gcagaagaac gattcttttc gtgggtttct ctgttcttcc acttctcagc 300
ttgagggcaa atgcttttga aggcttgtca gtagattctc aagtaaaagc acagccgcag 360
aaagaggaga cgagcaacaa tccaaggaaa tgcagagaat cccttctttt ctctacttaa 420
tggacttgga gtttttggtt caggcgtgct tggttctctt tatgccttgg ctcgaaacga 480
gaaggccgtt tctgatgcaa ccattgaatc tatgaaaaat aagctgaagg agaaagaagc 540
cacattcgtt tcatggagaa gaaattccag tctgagctgc tgaacgaaag ggatatacga 600
aataatcaac ttaagagggc aggcgaagaa cggcaagctc tggttaacca attgaattca 660
gcaaagagta cagtaactaa ccttggtcag gagctgcaaa aagaaaaacg aattgctgaa 720
gagctcatag ttcagatcga gggccttcaa aataacctca tgcagatgaa ggaggataag 780
aaaaaattgc aggaggagct taaagagaag cttgatttga tacaagttct gcaagaaaag 840
ataactttac ttactacaga gatcaaagat aaagaggcat ctcttcagag tacaacctct 900
aaactagctg aaaaagaatc agaggtagat aaattgagct caatgtatca ggaatcccag 960
gatcagctga tgaatttgac ttcagaaatc aaagaactta aagtcgaagt ccagaaaaga 1020
gagagagaac tagagttgaa acgtgaatca gaagacaacc ttaatgtgcg attaaattct 1080
ttgctcgttg agagagatga atctaaaaaa gagcttgatg ctattcaaaa ggaatacagc 1140
gagttcaagt ccatttcaga gaagaaagtg gcttctgatg ccaagctgtt gggggaacaa 1200
gaaaagagac tacaccagct cgaggaacaa cttggcactg cctcagatga agtacgcaaa 1260
aataatgtgc taatcgctga tctgactcaa gaaaaagaaa acttaaggag aatgctggac 1320
gctgagctgg aaaacataag caagttgaag ctagaggtcc aggttactca ggaaactctt 1380
gagaaatcta gaagtgatgc ttctgatata gcacaacaac tacagcagtc gaggcatctt 1440
tgctctaagc ttgaagctga ggtttctaaa cttcagatgg aattggagga aacaagaaca 1500
tcattacgga ggaacattga tgagacaaaa cgtggtgcag agctcttagc tgcggagctg 1560
accactacta gggagcttct aaagaaaaaa aaaaaag 1597



<210> 15 

<211> 564 

<212> DNA 

<213> Nicotiana tabacum 



<400> 15 

gaggaacaac ttggcactgc ctcagatgaa gtacgcaaaa ataatgtgct aatcgctgat 60
ctgactcaag aaaaagaaaa cttaaggaga atgctggacg ctgagctgga aaacataagc 120
aagttgaagc tagaggtcca ggttactcag gaaactcttg agaaatctag aagtgatgct 180
tctgatatag cacaacaact acagcagtcg aggcatcttt gctctaagct tgaagctgag 240
gtttctaaac ttcagatgga attggaggaa acaagaacat cattacggag gaacattgat 300
gagacaaaac gtggtgcaga gctcttagct gcggagctga ccactactag ggagcttcta 360
aagaaaacaa atgaagaaat gcacactatg tctcatgaac tagcggctgt tactgaaaat 420
tgtgataact tacagacgga gctagttgat gtctacaaga aagcagaacg tgctgctgat 480
gaactgaaac aagaaaagaa tattgtcgtg acactggaga aagagctaac atttttggag 540
gctcaaatta caagagagaa agag 564

<210> 16
<211> 2154
<212> DNA
<213> Lycopersicon esculentum
<400> 16

atggcaactt cttgttttcc tccattttct gcttcatctt cttcattatg ttcttcccaa 60 

tttacacctt tgctttcttg cccaagaaat acccaaatat gtagaaagaa gagaccggtt 120 

atggcgagta tgcactcgga aaatcaaaag gaaagtaatg tctgcaacag aagatcgatt 180 

ctatttgtgg gattctcagt tcttccactt ctcaatttga gggcaagagc tctcgaaggc 240

ttgtcaacag attctcaagc acagccgcag aaagaggaaa ccgagcaaac aatccaagga 300 

agtgcaggga atcccttcgt ttctctactt aatggacttg gtgttgttgg ttcaggcgtg 360 

cttggttctc tttatgcctt ggctcgaaat gagaaggcag tttcagatgc aaccattgaa 420 

tctatgaaaa ataagctgaa ggacaaggaa gatgcatttg tttcaatgaa gaagcaattt 480 

gagtccgaat tgctgagcga aagggaagat cgaaataagc taattaggcg agaaggtgaa 540

gagcggcaag ctttggttaa tcagttaaaa tcagcgaaga ctacagtaat aagccttggt 600 

caggagctgc aaaacgaaaa aaaacttgct gaagatctca aatttgagat caagggcctt 660 

caaaatgacc tcatgaatac gaaggaggat aagaagaaat tgcaggaaga gcttaaagag 720 

aagcttgatt tgattcaagt tcttgaagaa aagattactt tgcttactac agagatcaaa 780 

gataaagagg tgagtcttcg gagtaacacc tctaaactag ctgaaaaaga atcggaggta 840

aatagtttga gcgatatgta tcaacaatcc caggatcagc tgatgaattt gacttcagag 900 

atcaaagaac ttaaagatga aatccagaaa agagagagag aactggagtt gaaatgtgta 960 

tcagaagaca acctgaatgt gcaattaaat tctttgctcc tcgagagaga tgaatctaaa 1020 

aaagagcttc atgctattca aaaggaatac agtgagttca agtccaattc tgatgagaag 1080 

gtggcttcag atgcgaagct gttgggggaa caagagaaga gactacacca gcttgaggaa 1140

caacttggca ctgccttaag tgaagcaagt aaaaatgaag tgctaattgc tgatctgact 1200 

cgagaaaaag aaaaccttag gagaatggtg gatgctgagc tggacaatgt aaacaagtta 1260

aagcaagaga ttgaagtcac tcaggaaagt cttgagaatt caagaagtga agtttctgat 1320 

ataacagtac aactagagca gttgagggat ctttgctcca aacttgaagc tgaggtttct 1380 

aaacttcaga tggaattgga ggaaacaagg gcatcattac agaggaacat tgatgaaaca 1440

aaacacagtt cagagctctt agctgctgag ttgaccacta ctaaggagct tctaaagaaa 1500 

acaaatgaag aaatgcacac tatgtcagat gaactagtag ctgtttctga aaatcgtgat 1560 

agcttacaga cagagctagt tgatgtctat aagaaagcag aacatactgc taatgaactg 1620 

aaacaagaaa agagcattgt tgcaacacta gaagaagagt taaaatttct ggagtctcaa 1680 

attacacgag agaaagagtt acggaagagt ctggaagacg agttagaaaa ggctacagaa 1740

tctcttgatg agattaaccg aaatgtgttg gcacttgcag aggagctgga gcttgctact 1800 

tctcgtaatt ctagcctcga agacgagaga gaagtgctcc gacagtctgt ttctgagcag 1860 

aagcaaattt cacaagaagc ccaagaaaat ctggaagacg cccatagcct ggtgatgaaa 1920 

cttggcaagg aacgcgaaag tcttgagaag agagcaaaga aattggaaga tgaaatggca 1980 

gcagcaaaag gtgagatttt gcggctacgg agccaaataa actcagtaaa agctccagtg 2040 

gaggatgagg aaaaagttgt tgctggggaa aaggaaaagg tgaaggcaac agtaacagca 2100 

aagaaaacta ccaggagaag gaagagtgct actgttaagc aagaggaacc ctag 2154 

<210> 17 
<211> 717 
<212> PRT 

<213> Lycopersicon esculentum 
<400> 17 

Met Ala Thr Ser Cys Phe Pro Pro Phe Ser Ala Ser Ser Ser Ser Leu
1 5 10 15

Cys Ser Ser Gln Phe Thr Pro Leu Leu Ser Cys Pro Arg Asn Thr Gln
20 25 30

Ile Cys Arg Lys Lys Arg Pro Val Met Ala Ser Met His Ser Glu Asn
35 40 45

Gln Lys Glu Ser Asn Val Cys Asn Arg Arg Ser Ile Leu Phe Val Gly
50 55 60

Phe Ser Val Leu Pro Leu Leu Asn Leu Arg Ala Arg Ala Leu Glu Gly
65 70 75 80

Leu Ser Thr Asp Ser Gln Ala Gln Pro Gln Lys Glu Glu Thr Glu Gln
85 90 95

Thr Ile Gln Gly Ser Ala Gly Asn Pro Phe Val Ser Leu Leu Asn Gly
100 105 110

Leu Gly Val Val Gly Ser Gly Val Leu Gly Ser Leu Tyr Ala Leu Ala
115 120 125

Arg Asn Glu Lys Ala Val Ser Asp Ala Thr Ile Glu Ser Met Lys Asn
130 135 140

Lys Leu Lys Asp Lys Glu Asp Ala Phe Val Ser Met Lys Lys Gln Phe
145 150 155 160

Glu Ser Glu Leu Leu Ser Glu Arg Glu Asp Arg Asn Lys Leu Ile Arg
165 170 175

Arg Glu Gly Glu Glu Arg Gln Ala Leu Val Asn Gln Leu Lys Ser Ala
180 185 190

Lys Thr Thr Val Ile Ser Leu Gly Gln Glu Leu Gln Asn Glu Lys Lys
195 200 205

Leu Ala Glu Asp Leu Lys Phe Glu Ile Lys Gly Leu Gln Asn Asp Leu
210 215 220

Met Asn Thr Lys Glu Asp Lys Lys Lys Leu Gln Glu Glu Leu Lys Glu
225 230 235 240

Lys Leu Asp Leu Ile Gln Val Leu Glu Glu Lys Ile Thr Leu Leu Thr
245 250 255

Thr Glu Ile Lys Asp Lys Glu Val Ser Leu Arg Ser Asn Thr Ser Lys
260 265 270

Leu Ala Glu Lys Glu Ser Glu Val Asn Ser Leu Ser Asp Met Tyr Gln
275 280 285

Gln Ser Gln Asp Gln Leu Met Asn Leu Thr Ser Glu Ile Lys Glu Leu
290 295 300

Lys Asp Glu Ile Gln Lys Arg Glu Arg Glu Leu Glu Leu Lys Cys Val
305 310 315 320

Ser Glu Asp Asn Leu Asn Val Gln Leu Asn Ser Leu Leu Leu Glu Arg
325 330 335

Asp Glu Ser Lys Lys Glu Leu His Ala Ile Gln Lys Glu Tyr Ser Glu
340 345 350

Phe Lys Ser Asn Ser Asp Glu Lys Val Ala Ser Asp Ala Lys Leu Leu
355 360 365

Gly Glu Gln Glu Lys Arg Leu His Gln Leu Glu Glu Gln Leu Gly Thr
370 375 380

Ala Leu Ser Glu Ala Ser Lys Asn Glu Val Leu Ile Ala Asp Leu Thr
385 390 395 400

Arg Glu Lys Glu Asn Leu Arg Arg Met Val Asp Ala Glu Leu Asp Asn
405 410 415

Val Asn Lys Leu Lys Gln Glu Ile Glu Val Thr Gln Glu Ser Leu Glu
420 425 430

Asn Ser Arg Ser Glu Val Ser Asp Ile Thr Val Gln Leu Glu Gln Leu
435 440 445

Arg Asp Leu Cys Ser Lys Leu Glu Ala Glu Val Ser Lys Leu Gln Met
450 455 460

Glu Leu Glu Glu Thr Arg Ala Ser Leu Gln Arg Asn Ile Asp Glu Thr
465 470 475 480

Lys His Ser Ser Glu Leu Leu Ala Ala Glu Leu Thr Thr Thr Lys Glu
485 490 495

Leu Leu Lys Lys Thr Asn Glu Glu Met His Thr Met Ser Asp Glu Leu
500 505 510

Val Ala Val Ser Glu Asn Arg Asp Ser Leu Gln Thr Glu Leu Val Asp
515 520 525

Val Tyr Lys Lys Ala Glu His Thr Ala Asn Glu Leu Lys Gln Glu Lys
530 535 540

Ser Ile Val Ala Thr Leu Glu Glu Glu Leu Lys Phe Leu Glu Ser Gln
545 550 555 560

Ile Thr Arg Glu Lys Glu Leu Arg Lys Ser Leu Glu Asp Glu Leu Glu
565 570 575

Lys Ala Thr Glu Ser Leu Asp Glu Ile Asn Arg Asn Val Leu Ala Leu
580 585 590

Ala Glu Glu Leu Glu Leu Ala Thr Ser Arg Asn Ser Ser Leu Glu Asp
595 600 605

Glu Arg Glu Val Leu Arg Gln Ser Val Ser Glu Gln Lys Gln Ile Ser
610 615 620

Gln Glu Ala Gln Glu Asn Leu Glu Asp Ala His Ser Leu Val Met Lys
625 630 635 640

Leu Gly Lys Glu Arg Glu Ser Leu Glu Lys Arg Ala Lys Lys Leu Glu
645 650 655

Asp Glu Met Ala Ala Ala Lys Gly Glu Ile Leu Arg Leu Arg Ser Gln
660 665 670

Ile Asn Ser Val Lys Ala Pro Val Glu Asp Glu Glu Lys Val Val Ala
675 680 685

Gly Glu Lys Glu Lys Val Lys Ala Thr Val Thr Ala Lys Lys Thr Thr
690 695 700

Arg Arg Arg Lys Ser Ala Thr Val Lys Gln Glu Glu Pro
705 710 715

<210> 18 
<211> 407 
<212> DNA 

<213> Nicotiana tabacum 
<400> 18 

tcgaggaaca acttggcact gcctcagatg aagtacgcaa aaataatgtg ctaatcgctg 60 
atctgactca agaaaaagaa aacttaagga gaatgctgga cgctgagctg gaaaacataa 120 
gcaagttgaa gctagaggtc caggttactc aggaaactct tgagaaatct agaagtgatg 180 
cttctgatat agcacaacaa ctacagcagt cgaggcatct ttgctctaag cttgaagctg 240 
aggtttctaa acttcagatg gaattggagg aaacaagaac atcattacgg aggaacattg 300 
atgagacaaa acgtggtgca gagctcttag ctgcggagct gaccactact agggagcttc 360 
taaagaaaaa aaaaaaagga attcctgcag cccgggggat ccactag 407 
<210> 19

<211> 1491 

<212> DNA 

<213> Glycine max 



<400> 19 

gtgatgtcat ggagaaggaa tacaatgatc taaagttcag tgctgtaaag aaggctgctt 60
tggactctaa ggttttaaga gaaaaagaag aggagcttca tcagctaaag gatcagtttg 120
aacttgccct aggtgaagca agcaaaagcc agatcgtcat tgctgattta tcccaacaaa 180
gagatgattt gaaggaggct ctagataatg aatctagcaa ggtgaatcat ttgaagcaag 240
aactccaagt taccctggag aatcttgcaa aatcaagaaa tgagtctgct gaattggaaa 300
accttctaac tttgtcaaac aaactgtgca aagagctcga gctcgaggtc tctaagctct 360
catctgagct cactgaggtt aatgaatcgc tacagagaaa ccttgatgat gcgaaacatg 420
aggcagaaat gctagcaagt gagcttacaa ctgccaagga acacttgaag gaagcacaag 480
cagagctgca aggttgtcaa aagaatctga cagctgctct tgaaaagaat gatagcctac 540
aaaaagaatt agttgaagtc tacaaaaagg ctgaaagcac agcagaggat ttgaaggaac 600
aaaaacagtt agttgcttct ctgaacaaag atttacaagc attagagcag caagtctcaa 660
aagacaagga gtcccgaaaa tctcttgaga gggacctgga ggaggcgacc atatcactag 720
atgaaatgaa ccgaaatgcg gtgatccttt ctggggaact acagagagct aattctcttg 780
tttctagcct tgaaaaagag aaagatgtgc ttattaagtc cctaaccaac caaagaaatg 840
catgcaaaga ggcccaagac aacattgaag atgctcataa ccttatcatg aaacttggca 900
aagaaagaga gaatttagag aaaaaaggta agaaatttga agaggaattg gcttctgcca 960
agggtgagat attgcgcttg aagagtcgaa tcaattcttc aaaagttgct gttaacaatg 1020
gcccagtgca gaaagatgga ggtgaaaaaa aggtcaaccc ttcaaaagtt gcggtaaaca 1080
atgagcaagc acagaaagat gaaggtgaaa acaaggttac tgtaagtgca cggaagactg 1140
tcagaagaag aaaggctaat ccacaataac agagaaatta gagagttttc tattaaaaat 1200
attccgatta ggatcatgat attctgtaat aaactatttg gaagccagtt gattctattc 1260
acttttggca tgcaaatatt ttcatgtttt gcaatagtat tgacaaatta aatgacactg 1320
taggaattgt taagctaagc tttttggaga gttgatttct gatagtaaac ctaaaaaaaa 1380
aggtaagaat actattacca accttagtct gcaacattat acattagtgt atatacagct 1440
acttttccat gtctatgaag caaatcgaca agcttgttgc caaaaaaaaa 1491



<210> 20 

<211> 388 

<212> PRT 

<213> Glycine max 

<400> 20 

Asp Val Met Glu Lys Glu Tyr Asn Asp Leu Lys Phe Ser Ala Val Lys
1 5 10 15

Lys Ala Ala Leu Asp Ser Lys Val Leu Arg Glu Lys Glu Glu Glu Leu
20 25 30

His Gln Leu Lys Asp Gln Phe Glu Leu Ala Leu Gly Glu Ala Ser Lys
35 40 45

Ser Gln Ile Val Ile Ala Asp Leu Ser Gln Gln Arg Asp Asp Leu Lys
50 55 60

Glu Ala Leu Asp Asn Glu Ser Ser Lys Val Asn His Leu Lys Gln Glu
65 70 75 80

Leu Gln Val Thr Leu Glu Asn Leu Ala Lys Ser Arg Asn Glu Ser Ala
85 90 95

Glu Leu Glu Asn Leu Leu Thr Leu Ser Asn Lys Leu Cys Lys Glu Leu
100 105 110

Glu Leu Glu Val Ser Lys Leu Ser Ser Glu Leu Thr Glu Val Asn Glu
115 120 125

Ser Leu Gln Arg Asn Leu Asp Asp Ala Lys His Glu Ala Glu Met Leu
130 135 140

Ala Ser Glu Leu Thr Thr Ala Lys Glu His Leu Lys Glu Ala Gln Ala
145 150 155 160

Glu Leu Gln Gly Cys Gln Lys Asn Leu Thr Ala Ala Leu Glu Lys Asn
165 170 175

Asp Ser Leu Gln Lys Glu Leu Val Glu Val Tyr Lys Lys Ala Glu Ser
180 185 190

Thr Ala Glu Asp Leu Lys Glu Gln Lys Gln Leu Val Ala Ser Leu Asn
195 200 205

Lys Asp Leu Gln Ala Leu Glu Gln Gln Val Ser Lys Asp Lys Glu Ser
210 215 220

Arg Lys Ser Leu Glu Arg Asp Leu Glu Glu Ala Thr Ile Ser Leu Asp
225 230 235 240

Glu Met Asn Arg Asn Ala Val Ile Leu Ser Gly Glu Leu Gln Arg Ala
245 250 255

Asn Ser Leu Val Ser Ser Leu Glu Lys Glu Lys Asp Val Leu Ile Lys
260 265 270

Ser Leu Thr Asn Gln Arg Asn Ala Cys Lys Glu Ala Gln Asp Asn Ile
275 280 285

Glu Asp Ala His Asn Leu Ile Met Lys Leu Gly Lys Glu Arg Glu Asn
290 295 300

Leu Glu Lys Lys Gly Lys Lys Phe Glu Glu Glu Leu Ala Ser Ala Lys
305 310 315 320

Gly Glu Ile Leu Arg Leu Lys Ser Arg Ile Asn Ser Ser Lys Val Ala
325 330 335

Val Asn Asn Gly Pro Val Gln Lys Asp Gly Gly Glu Lys Lys Val Asn
340 345 350

Pro Ser Lys Val Ala Val Asn Asn Glu Gln Ala Gln Lys Asp Glu Gly
355 360 365

Glu Asn Lys Val Thr Val Ser Ala Arg Lys Thr Val Arg Arg Arg Lys
370 375 380

Ala Asn Pro Gln
385

<210> 21 
<211> 2019 
<212> DNA 
<213> Zea mays 

<400> 21 

cggacgcgtg ggcctaaatt tgaagggaca aagggtattg caaaacctga caacactcaa 60

cctgaaggaa ctcaggctga aactatacct gaagctcgtc agcgtgaatc atccttacag 120

ttggtgcaag aacaacctcc agagaatcca ctgcttggct ttcttggtat agttggagtt 180

gctgcctctg gtgttcttgg tgggctgtac ggcacttctc tacaagaaga aaaggccctg 240 

caatcaattg tctcctcaat ggagagcaaa ttggctgaaa atgaggcagc actttcattg 300 

atgagggata attatgagaa acggttactg gagcagcaag cagcacaaaa gaagcaatct 360 

atgaagttcc aggagcagga agtttctctt tcaggtcagt tggcttcagc aacaaagact 420 

ttgacatcac tgagtgaaga attcagaaag gagaagaaat tagctgagga acttagggat 480
gaaatacaga gattagagag tagtatcaca caagctggca ttgataatga tgtgcttgaa 540
actaaattgg aagaaaagct tggtgagatt aattttttgc aggaaaaggt aagtttactc 600
aaccaagaaa ttgatgataa ggagaagcac atcagggaac tcagtgcatc actttcctcg 660
aaggaagtag actaccaaaa gctgaccgct ttcacaaatc aaactaaaaa gagccttgag 720
cttgcaaatt ctagagtaca acaactcgag gaagaactaa gtacaactaa gaacgctctc 780
gtttctaaga tatcttctat tgattcactc aatgctaaac ttgaaacctt gaactctgaa 840
aagaagaagc tgacaaaaaa aataaatgag ttaatacaag agtatacaga cctgaaggtt 900
gcttcagaga caagggcaag ccatgattcc aaactactgt cagaaagaga tgatctgata 960
aaacagcttg aggaaaaact gtctgttgca ttaactgatt ctagcaaaga tcaagaaaca 1020
attgttgagt tgaacaagga gttggatgct accaaaatga tgctaaagaa tgaacttaag 1080
tccatggaag ctttaaaaga ttcaattcga tcatctgaag aggctctaaa gacttcaaga 1140
agtgaggttt ccaaactttc caaggagctt gaggaggcaa atgaattgaa tgaggacctg 1200
gtatcacaaa tttctaaact ccgagaggaa tccaatgaaa tgcaagtaga tctcactaat 1260
aaactaggag aggcagaatc actatctaaa gctctgtcag aagatttggc ttcagtaaat 1320
gaaatggttc agaagggaca agaagaactc gaagccacct ctattgagct ggcatctatt 1380
gctgaagctc gtgacaactt gaagaaagaa ttgctggatg cgtacaagaa tttggagtca 1440
accacacatg agcttgtcga ggaaagaaaa attgtgacag ccttaaacaa ggaacttgaa 1500
gcgttagcga aacagttgca ggttgattct gaagcaagaa aaagtctcga atcagacctg 1560
gaggaggcaa caaagtcact agatgaaatg aacaatagcg cgctgttact gtctaaagaa 1620
cttgagagca ctcattctag gagtgccact cttgaatctg agaaggaaat gctacgcaag 1680
gctctagctg aacaaacgaa aatcacaacc gaagctaagg aaaacacaga ggatgctcag 1740
aaccttatca caaggcttga gacagagaag gagagctttg aattgaggtg tagacatctt 1800
gaagaggaat tggcgttagc aaaaggtgag atactgcgcc taaggaggca gattagcaca 1860
aacagttctc agaaaccaag agcaagagga ccaccagagg ccagtgaaac tctgaaggag 1920
caacctgtga atgattataa tcagaagacc agtggagttg ttgctggaac tccacagcct 1980
gtgaaaagga ctgtaaggag aagaaaaggt ggcgcataa 2019

<210> 22
<211> 672
<212> PRT
<213> Zea mays

<400> 22

Arg Thr Arg Gly Pro Lys Phe Glu Gly Thr Lys Gly Ile Ala Lys Pro
1 5 10 15

Asp Asn Thr Gln Pro Glu Gly Thr Gln Ala Glu Thr Ile Pro Glu Ala
20 25 30

Arg Gln Arg Glu Ser Ser Leu Gln Leu Val Gln Glu Gln Pro Pro Glu
35 40 45



Asn Pro Leu Leu Gly Phe Leu Gly Ile Val Gly Val Ala Ala Ser Gly
50 55 60

Val Leu Gly Gly Leu Tyr Gly Thr Ser Leu Gln Glu Glu Lys Ala Leu
65 70 75 80

Gln Ser Ile Val Ser Ser Met Glu Ser Lys Leu Ala Glu Asn Glu Ala
85 90 95

Ala Leu Ser Leu Met Arg Asp Asn Tyr Glu Lys Arg Leu Leu Glu Gln
100 105 110

Gln Ala Ala Gln Lys Lys Gln Ser Met Lys Phe Gln Glu Gln Glu Val
115 120 125

Ser Leu Ser Gly Gln Leu Ala Ser Ala Thr Lys Thr Leu Thr Ser Leu
130 135 140

Ser Glu Glu Phe Arg Lys Glu Lys Lys Leu Ala Glu Glu Leu Arg Asp
145 150 155 160

Glu Ile Gln Arg Leu Glu Ser Ser Ile Thr Gln Ala Gly Ile Asp Asn
165 170 175

Asp Val Leu Glu Thr Lys Leu Glu Glu Lys Leu Gly Glu Ile Asn Phe
180 185 190

Leu Gln Glu Lys Val Ser Leu Leu Asn Gln Glu Ile Asp Asp Lys Glu
195 200 205

Lys His Ile Arg Glu Leu Ser Ala Ser Leu Ser Ser Lys Glu Val Asp
210 215 220

Tyr Gln Lys Leu Thr Ala Phe Thr Asn Gln Thr Lys Lys Ser Leu Glu
225 230 235 240

Leu Ala Asn Ser Arg Val Gln Gln Leu Glu Glu Glu Leu Ser Thr Thr
245 250 255

Lys Asn Ala Leu Val Ser Lys Ile Ser Ser Ile Asp Ser Leu Asn Ala
260 265 270

Lys Leu Glu Thr Leu Asn Ser Glu Lys Lys Lys Leu Thr Lys Lys Ile
275 280 285

Asn Glu Leu Ile Gln Glu Tyr Thr Asp Leu Lys Val Ala Ser Glu Thr
290 295 300

Arg Ala Ser His Asp Ser Lys Leu Leu Ser Glu Arg Asp Asp Leu Ile
305 310 315 320

Lys Gln Leu Glu Glu Lys Leu Ser Val Ala Leu Thr Asp Ser Ser Lys
325 330 335

Asp Gln Glu Thr Ile Val Glu Leu Asn Lys Glu Leu Asp Ala Thr Lys
340 345 350

Met Met Leu Lys Asn Glu Leu Lys Ser Met Glu Ala Leu Lys Asp Ser
355 360 365

Ile Arg Ser Ser Glu Glu Ala Leu Lys Thr Ser Arg Ser Glu Val Ser
370 375 380

Lys Leu Ser Lys Glu Leu Glu Glu Ala Asn Glu Leu Asn Glu Asp Leu
385 390 395 400

Val Ser Gln Ile Ser Lys Leu Arg Glu Glu Ser Asn Glu Met Gln Val
405 410 415

Asp Leu Thr Asn Lys Leu Gly Glu Ala Glu Ser Leu Ser Lys Ala Leu
420 425 430

Ser Glu Asp Leu Ala Ser Val Asn Glu Met Val Gln Lys Gly Gln Glu
435 440 445

Glu Leu Glu Ala Thr Ser Ile Glu Leu Ala Ser Ile Ala Glu Ala Arg
450 455 460

Asp Asn Leu Lys Lys Glu Leu Leu Asp Ala Tyr Lys Asn Leu Glu Ser
465 470 475 480

Thr Thr His Glu Leu Val Glu Glu Arg Lys Ile Val Thr Ala Leu Asn
485 490 495
Lys Glu Leu Glu Ala Leu Ala Lys Gln Leu Gln Val Asp Ser Glu Ala
500 505 510

Arg Lys Ser Leu Glu Ser Asp Leu Glu Glu Ala Thr Lys Ser Leu Asp
515 520 525

Glu Met Asn Asn Ser Ala Leu Leu Leu Ser Lys Glu Leu Glu Ser Thr
530 535 540

His Ser Arg Ser Ala Thr Leu Glu Ser Glu Lys Glu Met Leu Arg Lys
545 550 555 560

Ala Leu Ala Glu Gln Thr Lys Ile Thr Thr Glu Ala Lys Glu Asn Thr
565 570 575

Glu Asp Ala Gln Asn Leu Ile Thr Arg Leu Glu Thr Glu Lys Glu Ser
580 585 590

Phe Glu Leu Arg Cys Arg His Leu Glu Glu Glu Leu Ala Leu Ala Lys
595 600 605

Gly Glu Ile Leu Arg Leu Arg Arg Gln Ile Ser Thr Asn Ser Ser Gln
610 615 620

Lys Pro Arg Ala Arg Gly Pro Pro Glu Ala Ser Glu Thr Leu Lys Glu
625 630 635 640

Gln Pro Val Asn Asp Tyr Asn Gln Lys Thr Ser Gly Val Val Ala Gly
645 650 655

Thr Pro Gln Pro Val Lys Arg Thr Val Arg Arg Arg Lys Gly Gly Ala
660 665 670



<210> 23 

<211> 322 

<212> DNA 

<213> Oryza sativa 

<220> 

<223> n= g, a, c or t 



<400> 23 

gagagaaact agttctagga aggacactct tgaagcagag aaaaaaatgt tatcaaaggc 60
tcttgctgag caacagaaga tcacaactga agctcatgaa aacactgagg atgctcagaa 120
tcttatctct aggcttcaga ctgagaagga gagttttgaa atgagggcta gacatcttga 180
agaggagttg gcgttagcaa agggtgagat attgcgccta agaaggcaga ttagtacaag 240
cagatcacag aaagcaaaaa ctcttccaaa cacaaatgca tctccagagg tcagtcaggc 300
tccangacga gcaggctgtg aa 322



<210> 24 

<211> 107 

<212> PRT 

<213> Oryza sativa

<220> 

<223> X= G or R 



<400> 24 

Arg Glu Thr Ser Ser Arg Lys Asp Thr Leu Glu Ala Glu Lys Lys Met
1 5 10 15

Leu Ser Lys Ala Leu Ala Glu Gln Gln Lys Ile Thr Thr Glu Ala His
20 25 30

Glu Asn Thr Glu Asp Ala Gln Asn Leu Ile Ser Arg Leu Gln Thr Glu
35 40 45

Lys Glu Ser Phe Glu Met Arg Ala Arg His Leu Glu Glu Glu Leu Ala
50 55 60

Leu Ala Lys Gly Glu Ile Leu Arg Leu Arg Arg Gln Ile Ser Thr Ser
65 70 75 80

Arg Ser Gln Lys Ala Lys Thr Leu Pro Asn Thr Asn Ala Ser Pro Glu
85 90 95

Val Ser Gln Ala Pro Xaa Arg Ala Gly Cys Glu
100 105



<210> 25
<211> 27
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial

<400> 25
gctctagagg aacaacttgg cactgcc 27

<210> 26
<211> 27
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial

<400> 26
cgggatcctc ttgtaatttg agcctcc 27






